Fog Creek Software
Discussion Board




Error handling - why at all?

http://weblogs.asp.net/oldnewthing/archive/2004/04/22/118161.aspx#FeedBack

Valid points. But why handle exceptions at all? If one codes with strict requirements and validations, a la DBC, why not just let the app die after dumping the state to a log? Or is exception handling by definition a validation routine?

KayJay
Friday, April 23, 2004

Great idea. I should just create perfect code without bugs and take into account any situation that may ever occur, then I don't need exception handling.

On a more serious side: A reason not to let an app die is that most end users don't really appreciate it. One should allow for plan B if possible. Even if something really, really bad happened, they probably still want to save their changes. This is where exception handling comes in really handy.

IMHO both exception handlers and error codes have a raison d'être. In some situations error codes are better; in others exception handlers are preferred. Saying that one or the other is evil is another, way too black-or-white, discussion.

I guess I don't think equal to the great minds.

Jan Derk
Friday, April 23, 2004

Actually it is an interesting question.

I have only been learning about exception handling over the last few months.

I mean I knew it existed but always thought it was spaghetti code type stuff. An exception stops code and then jumps to another section, this is very like the old days of 'goto' in BASIC. I always did, and still do, hate the thought of exception handling for that reason. Same reason I do not like putting in 'end loop' statements as a means of jumping out of a loop.

This is not to say that I have not been gaining an appreciation for why they all exist, just that they irk me.

Aussie Chick
Friday, April 23, 2004

OK, In my world (databases) it's quite common to not be able to complete an update on the database because some resource is temporarily locked -- especially in high-volume applications.

I have a choice:

(a) crash the app, core dump, and screw the user.

(b) catch the exception, wait some random number of milliseconds for the resource to be unlocked, retry and complete the transaction -- all invisible to the user.

Which do you think the user would prefer?

"Error handling -- why at all?" That's a damned silly question from a professional developer.

Sgt. Sausage
Friday, April 23, 2004

There seem to be a few things mixed up here. I thought exception handling vs. return codes is a matter of style (separating different execution paths in the source file), and has nothing to do with handling errors or not. People argue and have different preferences over which style is clearer, but they do not argue for not dealing with errors at all.
There are recoverable errors, in which it makes sense to retry or try a different approach, and unrecoverable errors, where the whole process needs to be ended because continuing just has no meaning.
You should never let an app "just die". At least you should try to free the resources you claimed and then exit gracefully.

Just me (Sir to you)
Friday, April 23, 2004

KayJay, I believe the opposite; languages need to be more organic and have much more developed "unusual situation" facilities.

Apparently Multics had great ideas to mine.  They found their way to Kent Pitman's "Conditions System."

I'm not saying languages should be sloppy and ambiguous.  Instead, they should have power in the direction of handling unusual code paths, which normal functions are not that great at handling.

Tayssir John Gabbour
Friday, April 23, 2004

"I mean I knew it existed but always thought it was spaghetti code type stuff. An exception stops code and then jumps to another section, this is very like the old days of 'goto' in BASIC."

If you apply that logic to exception handling, then you must also apply it to object-orientation because it's the same principle of code execution not following a sequential path but jumping about instead. Is that what you think?

John Topley (www.johntopley.com)
Friday, April 23, 2004

"I mean I knew it existed but always thought it was spaghetti code type stuff. An exception stops code and then jumps to another section, this is very like the old days of 'goto' in BASIC."

That statement is a bit misleading.  An exception does cause the code to jump, but in a very predictable manner--back up the call stack.
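
A minimal Python sketch of that predictability, using made-up function names that are not from the thread: the exception raised in the innermost call skips the rest of each frame it passes through and lands in the nearest enclosing handler up the call stack, never in unrelated code.

```python
trace = []  # records the order in which code actually runs


def innermost():
    trace.append("innermost: start")
    raise ValueError("something went wrong")


def middle():
    trace.append("middle: start")
    innermost()
    trace.append("middle: end")  # skipped: the exception unwinds past this line


def outer():
    try:
        middle()
    except ValueError as exc:
        # Control arrives here, up the stack from where the error was raised.
        trace.append(f"outer: handled {exc}")


outer()
print(trace)
```

Unlike a 'goto', the only possible landing spots are the handlers of the functions that are currently on the stack.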

"If you apply that logic to exception handling, then you must also apply it to object-orientation because it's the same principle of code execution not following a sequential path but jumping about instead. Is that what you think?"

A bit off topic, but I think that OO style code, which jumps from object to object modifying variables (state), is the cause of many bugs.  Variables in an object are just globals on a smaller scale.  It's similarly difficult to debug code that has multiple methods modifying variables as code that has multiple blocks modifying globals.

Matt
Friday, April 23, 2004

Should your browser just exit when you make a typo?

son of parnas
Friday, April 23, 2004

I was prompted to think so because of the following, among other similar broken and confused thoughts:

1) A premise that there is only _one_ correct way the application functions, or to be precise, a _finite_ set of *correct* ways the application will receive input, process the input and generate output. Consequently, that set can be explicitly indicated in the code. A corollary: an _infinite_ set of *incorrect* ways exists. Therefore, anything that does not match the set of permitted parameters is, by definition, unacceptable and hence not required by the application. Should such a situation be encountered by the application, it should cease to function or else (3) raises its ugly head.

2) A premise that all validation of input is done before processing the input, not generating an exception when the processed output does not match the required specification.

3) OO techniques, a pre-requisite for most non-trivial applications, viz. encapsulation, inheritance, untyped variables, etc., ensure that there are no systemic (though, for the programmer/maintainer, it is quite the opposite) leakages between child and grandchild modules deep into the modular structure of the parent module(s). This guarantees that all errors propagate down the hierarchy in all their entirety.

4) Of course, the above does not take into consideration external resources, such as network availability or database access. Granted that since those other applications raise errors, one is forced to raise errors within one's own application to maintain a mutually understandable communication channel among them. But then again, why cannot a database library return state information instead of raising errors? If a database is locked, a check for IsStateLocked may be added; if a database is unavailable, the library may return state information indicating so, so that one may be able to code something like,

............IsDBAvailable(Resource_Location)
...............OpenDB()
...................IsDBAccessible
.......................ProcessDB(With_This_Data)



Regards

Kaushik Janardhanan

KayJay
Friday, April 23, 2004

> so that one may be able to code something like

That doesn't work if the database becomes unavailable after you call IsDBAvailable but before you call (or, while you are calling) OpenDB.

Christopher Wells
Friday, April 23, 2004

In this discussion I miss the distinction between exceptions and violations. Violations are when the assumptions of the programmer appear not to be correct. In that circumstance, any form of handling is a form of uneducated guessing. How can one safely repair the program's condition if one obviously has lost track? Violations are raised by contracts. Exceptions are less serious and rather common, e.g. broken connections, printers out of paper, locked files, etc. These are conditions that are not usual, but they should be anticipated. Exceptions are rather common and should be handled. Handling violations usually is not a good idea, unless your software controls a mission critical device for which a mem dump is not an option. E.g. if your software controls a space craft reentering the atmosphere, then guessing can be better than shutting-down, because the latter has a 100% chance of disaster.
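
One way to picture the distinction, as a rough Python sketch with hypothetical names (ContractViolation, average and read_config are invented for illustration): the contract check raises and nobody catches it, while the anticipated condition is caught and given a fallback.

```python
class ContractViolation(Exception):
    """Hypothetical error type for a broken programmer assumption."""


def average(values):
    # Contract: callers must never pass an empty list. A failure here is a
    # violation, so it is raised and deliberately left unhandled.
    if not values:
        raise ContractViolation("precondition violated: values must be non-empty")
    return sum(values) / len(values)


def read_config(path, default=""):
    # A missing file is an anticipated exception: unusual but expected,
    # so it is handled with a sensible fallback.
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return default
```

Note that read_config catches only the one condition it anticipates; anything else still propagates and breaks loudly.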

In our company we make our software as brittle as possible. We hardly have any exception handling mechanisms in our code base. Let it break, let it break, let it break! False assumptions are shown instantly, and they are not concealed by repair code. Nice side effect: we do not have to write repair code. Repair code is often buggier than the code it must repair. The programmer is already in a difficult program state, the code is difficult to test, and coverage is a serious problem.

So if our software breaks, we remove the bug, rather than add repair code.

Incidently, three days ago I wrote an article about us doing away with debugging software, which is closely related to this discussion. If I break instantly you do not need a debugger:
http://www.hello.nl/articles/TheDebuggerisdeadlonglive.html

KAreel Thönissen (www.garabit.nl)
Friday, April 23, 2004

IsDBOpened()

That would be the case right down to an UPDATE query. Hence Transactional Programming. If the DB is available, open it; if opened, then access it; having accessed it, process it.

It may sound pedantic, but, each action is based on a pre-determined favourable state, such state being known by querying for it.

Raising errors is like saying "I do not know what you want, but this is what I can give". I do not want anything other than what I have asked for, and I can take a "No" as an answer.

KayJay
Friday, April 23, 2004

Notice the typos, some of them rather funny: 'a programmer in a difficult program state', 'I break instantly', and a typo in my own name. That is the consequence of not using a mental debugger (-8.

Karel Thönissen (www.garabit.nl)
Friday, April 23, 2004

>False assumptions are shown instantly, and they are not concealed by repair code.

Precisely. And thanks for the link.

KayJay
Friday, April 23, 2004

==>IsDBOpened()

=>That would be the case right down to an UPDATE query. Hence Transactional Programming. If the DB is available, open it; if opened, then access it; having accessed it, process it.

Malarkey!

Done much database programming in high-volume/high-transaction environments? Didn't think so.

Do you understand "Transactional Programming" ? Didn't think so.

IsDBOpened? Yes, proceed with UPDATE.

In what make-believe world is the following block of code an atomic transaction?

If IsDBOpened() Then DoTheUpdate()

There is a small, but finite time, between your call to IsDBOpened() and the call to DoTheUpdate() where, for whatever reason, the DB may be closed. Maybe, by coincidence, a hub went down in between the two calls.
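
The gap can be simulated with a toy Python sketch. FlakyDB and ConnectionLost are hypothetical stand-ins, not a real database API, and the failure is forced to happen right after the check purely for demonstration.

```python
class ConnectionLost(Exception):
    """Raised by the hypothetical FlakyDB when the connection is gone."""


class FlakyDB:
    """Toy stand-in for a database whose connection drops after the check."""

    def __init__(self):
        self.is_open = True

    def is_db_opened(self):
        answer = self.is_open
        # Simulate the hub going down right after the check is answered.
        self.is_open = False
        return answer

    def do_the_update(self):
        if not self.is_open:
            raise ConnectionLost("connection dropped before the update ran")
        return "updated"


def check_then_act(db):
    # The check passes, yet the update still fails: the two calls are not atomic.
    if db.is_db_opened():
        return db.do_the_update()
    return "skipped"


def act_and_handle(db):
    # Attempt the operation and handle the failure where it actually occurs.
    try:
        return db.do_the_update()
    except ConnectionLost:
        return "failed, rolled back"
```

Calling check_then_act(FlakyDB()) lets ConnectionLost escape even though the check said yes, which is the whole problem with asking first.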

When you get to the actual UPDATE, "Transactional Programming" will likely (most platforms) throw an exception or return an error code if it fails.

When UPDATE is processed, and it fails, you've got as far as I see it 2 options:

(1) don't catch the error / exception, let the application core dump, GPF, ABEND, or whatever it's called on your particular platform.

Unacceptable to users.

(2) catch the exception. This requires proper error handling code --  proper handling of the exceptions. From here, you've got a multitude of options:

(2) (a) Roll back the transaction, throw up an error message to the user. Maybe s/he'll retry the transaction.

(2) (b) Retry the transaction without user input. Maybe it will work this time as the locked resources may no longer be locked.

(2) (c) Do nothing -- eat the exception thrown by the DB and don't notify the user : unacceptable for *most* applications, but acceptable under some scenarios.

The point is, for all of the acceptable scenarios, you've got to:

(i) Detect the error. You can't do this if you simply ignore error handling.

(ii) React to the error. *Handle* the error. You can't do this if you don't first detect the error.

Error handling is *required* in my world (databases). To not handle them is absurd. In my world, if you're not properly dealing with errors/exceptions, you are an amateur and don't deserve your paycheck.

==>Raising errors is like saying "I do not know what you want, but this is what I can give".

Who says they have to be raised ? Who says they have to be presented to the user? Take a look up at your title. It says "Error *handling*" [emphasis added]. The errors certainly have to be handled. You may be able to handle them without the user ever even knowing. EXAMPLE: Most of my applications go to perform a DB UPDATE. If the UPDATE fails due to locked resources it goes through the following:

(1) Wait a random number of milliseconds between MinWait and MaxWait (application level configuration settings). This assumes that most locking contention issues are short lived, and after a few milliseconds the locks may be released.

(2) Retry the UPDATE.

(3) If successful, the user never knows that it was tried again.

(4) If it fails, goto (1) Wait, then (2)  Retry ... and continue to retry until MaxRetries (another application level configuration setting).

(5) If we exceed MaxRetries, notify the user and ask if s/he wants to attempt the UPDATE again. If not, abort and cleanup (with possible ROLLBACKs, depending on the situation.) If so, goto (1) and start again.

Believe it or not, this *handles* most locking contention issues in a DB world, without a user ever even knowing the update was tried multiple times.
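
Steps (1) to (4) above could look roughly like this in Python. This is only a sketch under assumptions: ResourceLocked, update_with_retry and the default values are invented here, with min_wait_ms, max_wait_ms and max_retries standing in for the MinWait, MaxWait and MaxRetries settings. Step (5), asking the user, is left to the caller.

```python
import random
import time


class ResourceLocked(Exception):
    """Hypothetical error raised when the update hits a locked resource."""


def update_with_retry(do_update, min_wait_ms=5, max_wait_ms=50, max_retries=3,
                      sleep=time.sleep):
    """Try do_update(); on ResourceLocked wait a random interval and retry.

    Returns True on success and False once max_retries retries are used up,
    at which point the caller can ask the user whether to start over.
    """
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        try:
            do_update()
            return True  # step (3): success, the user never knows
        except ResourceLocked:
            if attempt == max_retries:
                return False  # retries exhausted, hand over to step (5)
            # Step (1): wait a random number of milliseconds, then loop
            # around to step (2) and retry.
            sleep(random.uniform(min_wait_ms, max_wait_ms) / 1000.0)
    return False
```

Only the lock error is caught; any other failure still propagates, so the retry loop does not hide problems it was never meant to handle.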

****
QUESTION: Would you deposit your hard earned money in your bank account if you knew the bank's IT department took your same cavalier attitude about handling errors? -- OOPS! Sorry Mr. KayJay, we have no record of your payroll deposit -- see you next payday!

What about stepping in front of the X-Ray machine at your doctor's office if you knew the software that controlled the machine was programmed by someone who didn't care about error handling?

How about hopping on a plane where the programmers in the flight control group didn't care about handling errors?

Granted, these situations involve programming in a different world than most of us live in, but I deal with hundreds of millions of dollars in health insurance related claims. If I don't handle errors, people don't get paid. It's not pretty for anyone involved when the State Insurance Commissioner gets involved 'cause someone didn't get their check on time. Medical bills can and do put people in bankruptcy -- all the time. It's absolutely essential in my world to handle errors appropriately, and make sure the checks get cut and sent on time.

You should anticipate potential errors, check for them when necessary (check error codes, return codes, catch exceptions, whatever model your tools support) , and handle them in whatever way is appropriate for your domain.

Sgt. Sausage
Friday, April 23, 2004

Guess I have miscommunicated. Again. Nothing unusual, but no less irritating. To both of us, I must admit. My most sincere apologies.

Have I done any enterprise level DB apps, on the scale I understand you to question my experience with? No. Have I done any sensitive information processing with DBs? Yes.

I am not complaining about having to handle errors. I am also not advocating abandoning error handling. What I am arguing for is that, given,

1) Presence of discrete sequences in computational processes.

2) Capacity to communicate boolean information across various modules in a given system.

3) Facility to hold state information for evaluation at some arbitrary time in the future.

I find it difficult to comprehend why a fixed set of pertinent Yes/No questions is not asked before acting on the results of such questions.

Let us presume that the sum total of my app is confined to itself. No external interaction, but for keyboard input and screen output. The app is built contractually. Each action requires certain kinds of keyboard input, which are then processed and the results are printed onto the screen for the user to act (or not) upon. Will my "cavalier" attitude, as you would have it, generate any remarkable inconvenience to the user?

In the non-hypothetical situations, unlike the one above, catch-all situations are implemented where unrequested information is passed along with a context indicating failure. Yes, Oracle, SQL Server, DB2, the NT FileSystem, all are extraneous but vital for my Application. Hence I have to handle any errors based on prior knowledge that errors are raised by those entities. Should they also function contractually, both internally as well as expose state information to my Application, well.....

KayJay
Friday, April 23, 2004

==> I am not complaining about having to handle errors.

Funny. That's not how I read the thread's title and your original post.

==> I am also not advocating abandoning error handling

O.K. I'll assume that's where you're coming from on this.

I misunderstood and thought you were aiming to abandon any/all error/exception handling.

My bad.

==> Let us presume that the sum total of my app is confined to itself. No external interaction, but for keyboard input and screen output.

That's a remarkable presumption. What type of system is that? Not a very useful one. I'd concede that yes, indeed, you can completely abandon proper error handling if you're working on a useless application < VeryBigGrin ;) >.

Sgt. Sausage
Friday, April 23, 2004

As an aside (mostly a reply to Joel's original Exceptions article), the Linux Kernel has an interesting rule. If you have a block of code that is initializing something, and some of the steps may fail, you use a system something like this:

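int x, y;

if (!init(x))
  goto x_failed;
if (!init(y))
  goto y_failed;

return 0;      /* success: both steps are initialized */

y_failed:
  uninit(x);   /* y failed, so undo x */
x_failed:
  return -1;   /* failure: everything set up so far has been undone */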

For initialization steps that may involve 4 or 5 parts, you end up with 2 exit points in the function, one for success and one for failure. They are both at the end of the function, and it's very simple to unroll the failed initializations in the opposite order.

In my opinion, this is how to handle these kinds of situations while keeping your code readable.
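
For comparison, a language with exceptions can get the same reverse-order unwinding without goto. The Python sketch below swaps in contextlib.ExitStack from the standard library for the labels; init_all and the (init, uninit) pairs are hypothetical names, and the 0 / -1 return values simply mirror the C fragment above.

```python
from contextlib import ExitStack


def init_all(steps):
    """Run (init, uninit) pairs in order; undo the completed ones on failure.

    Returns 0 on success and -1 on failure.
    """
    with ExitStack() as stack:
        for init, uninit in steps:
            if not init():
                # Leaving the with-block runs the registered uninit callbacks
                # in reverse order of registration.
                return -1
            stack.callback(uninit)
        # Everything succeeded: detach the callbacks so nothing is undone.
        stack.pop_all()
        return 0
```

With three steps where the third fails, the second is undone first and then the first, just as falling through y_failed into x_failed does.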

Ryan Anderson
Friday, April 23, 2004

Some people don't grasp pointers, some don't grasp templates, yet some others don't grasp exceptions as can be easily seen from many posts and articles on this site. To title any of these categories as 'great minds' is, well, an oversimplification.

coresi
Friday, April 23, 2004

Hmmm.......... speaking past each other.

OK, one more attempt before shutdown and shuteye.

For situations when one employs exceptions and/or error-codes, could you think of any other method(s) of overcoming and/or resolving them?

KayJay
Friday, April 23, 2004

Sgt. Sausage rules. Personally, I believe in trust but verify. I work on web systems. Most of my cohorts rely on javascript validation if anything. In my code I question everything and I have warnings and dies. In both cases I send myself an email. I modify my code to avoid getting future warns and dies. I then deliver the user to a previous state if possible with a message that pulls from a config file for a business analyst/writer to write or I die with a similar message (not my language but one written with the user in mind). My goal is to get as few warnings and dies as possible. Over time I refactor my code to achieve that. Works for me and I hope the users appreciate it as I code for them.

must remain anonymous
Friday, April 23, 2004

==>For situations when one employs exceptions and/or error-codes, could you think any other method(s) of overcoming and/or resolving them?


No. I'm not that smart. If I were, I'd be writing compilers for a living instead of processing claim records in the database <grin>

******

In an earlier post, you mention the following:

............IsDBAvailable(Resource_Location)
...............OpenDB()
...................IsDBAccessible
.......................ProcessDB(With_This_Data)

Which is downright silly -- These are not atomic actions. Just because the DB IsAccessible when you call IsDBAccessible(), doesn't mean it's still accessible by the time your code gets around to executing ProcessDB().

What happens when:

.....IsDBAvailable(Resource_Location): TRUE
........OpenDB(): TRUE
............IsDBAccessible: TRUE
................ [Router goes out to lunch here] <OR>
................ [DB RAID Array fails here    ] <OR>
................ [Power supply fries on server ] <OR>
................ [Your session killed by DBA  ] <OR>
................ [Some DB resource is locked  ] <OR>
................ [Whatever                    ] <OR> ...
................ProcessDB(With_This_Data)


Just because you checked it a couple of nanoseconds ago, doesn't mean the resource is still available by the time you execute ProcessDB()

Unless ProcessDB notifies you somehow: return an error/success code, raise an exception, send an error event to some event sink, <whatever> -- if ProcessDB() doesn't notify you of an error, you have no way of knowing WTF happened With_This_Data. Did the update succeed? You've no idea because you have no return code, or didn't catch the exception, or weren't listening for error events or <whatever>.

I'll ask you the same: How would you have ProcessDB notify you that the router just died? Right now, the standard methods include (a) returning an error code/error object, (b) raising an exception, or (c) raising an error event. I've not seen anything else in my career, but maybe I'm just not educated enough on these matters.
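
For concreteness, methods (a) and (b) might look like this in Python. Everything here is a hypothetical illustration: the function names, the link_up flag that fakes the router's state, and the messages are all invented.

```python
class UpdateFailed(Exception):
    """Hypothetical exception for a failed update."""


def process_db_with_code(data, link_up):
    # Method (a): report the outcome through a return value the caller must check.
    if not link_up:
        return (False, "router unreachable")
    return (True, "")


def process_db_with_exception(data, link_up):
    # Method (b): report failure by raising; a normal return means success.
    if not link_up:
        raise UpdateFailed("router unreachable")


def save(data, link_up):
    # The caller learns the outcome only because ProcessDB reported it.
    ok, reason = process_db_with_code(data, link_up)
    if not ok:
        return f"not saved: {reason}"
    return "saved"
```

Either way the failure is discovered at the moment of the update itself, not by a check made beforehand.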

How would *you* do it?

Sgt. Sausage
Friday, April 23, 2004

Sgt. you sure like nailing a nailed nail.  Point taken.

And I did not intend climbing up that tree. Rather, I wanted to climb all the trees. If all components of a system, an App, the DB, the Network, et al, maintained a state table of their current state, at any given point in time, it would really save a lot of headaches, including the one this thread gives you ;)

KayJay
Friday, April 23, 2004

==>Sgt. you sure like nailing a nailed nail. 

If all you've got is a hammer ... <grin>

==>Point taken.

Obviously not. See below.

==>If all components of a system, an App, the DB, the Network, et al, maintained a state table of their current state, at any given point in time, it would really save a lot of headaches

Nope. Not as stated. You'd *still* have the exact same headache I've been trying to hammer home.

As given, a simple state table (whether in memory or elsewhere -- you didn't specify) would simply *not* work for the same reasons in my above post.

So you check the state table and it says the DB is available. So what. By the time your code gets to executing the next step, actually updating the DB, maybe it's not available (see my earlier post for possible reasons). How would you know? You just checked it and it's available, but now it's not. Your ProcessDB function would assume all is well because you just checked it, when in fact it's no longer available.

What you have submitted as a solution to your headaches has, in fact, done nothing for your headache.

You still have the same issue(s). ProcessDB *must* report back to you in some manner. Did it succeed or did it fail?

You might argue that ProcessDB can update the above mentioned state table to let you know it's failed, but I'd argue that that's much more difficult than current accepted methods of detecting errors. You'd still have to check your state table after *every* call for EverythingThatMatters to make sure it didn't fail. How is that any different from, or any easier than, checking a return code, or catching an exception after every call?

I'll keep hammering ...

Sgt. Sausage
Friday, April 23, 2004

BTW:

Have you come to understand yet, after really thinking about it, how much of an ImpossibleDream the following is:

==> If all components of a system, an App, the DB, the Network, et al, maintained a state table of their current state, at any given point in time

and just what it would take to create such a monster?

Sgt. Sausage
Friday, April 23, 2004

(It's the morning after and there is more blood in my alcohol now!)

Sergeant, I use screwdrivers more than hammers. And honestly, I do take/concede the point.

The current app I am working on involves considerable mathematical functions that operate on arbitrary textual and numerical content of varying length, whose result is a simple display or at most a disk file. While working on validation for the input, I came across Mr. Chen's article and thence this post. One particular sequence of operations had to conform to strict requirements on the possible length of the input and the limited numerical types permitted, as a lot of bit-masking was implemented in the algorithm (FWIW, this is a quasi-crypto app, one of the "useless" but simple Input-Process-Output kind, where the Processing is 95% of the work).

While the above was being fleshed out, it struck me that since the algorithm was a given, 100% conformance to the input parameters ensured that there was no need to raise errors in the app and consequently no necessity to handle any errors. There were no situations where a resource would be unavailable - RAM, Processor, Power Supply - and if any of them fails, the app anyway cannot function, so any rollback code would be useless.

Once again, my apologies if I have blown off my top.

Regards

Kaushik Janardhanan

KayJay
Saturday, April 24, 2004
