Fog Creek Software
Discussion Board




Transactions and Rollback Difficult?

From Joel's latest entry where he asks that other guy to provide a better example, it seems like his argument is just that doing transaction and rollback style operations is difficult using exceptions.

Now that I think about it, maybe that's right:

try {
  copyFiles();
  createRegistryEntries();
} catch (IOException e) {
  // How do I know which one failed?
} catch (SomeOtherException e) {
  // How do I know which one failed?
} finally {
  // Which operations were completed, so I can roll them back?
}

Is there an easy way to do this properly using exceptions?

Fred
Thursday, October 16, 2003

What you are trying to do isn't easy using any method.  Rollbacks in general take some work on the developer's part.

To make a general rollback, your exception handler should look for the results of all possible actions and undo any that it finds. If that isn't possible, modify your code to use more fine-grained exception handling.

Note that my comments don't really depend on whether you're checking return values or catching exceptions. Which method you use is a stylistic choice. The problem to be solved is the same, only the method used to detect it changes.

Clay Dowling
Thursday, October 16, 2003

It seems like the non-exception example Joel posted would be easier than my exception example.

In Joel's example, he just checks after every operation. If I wrapped each method call with its own try catch blocks, I could do the same thing. That seems messy though.

Fred
Thursday, October 16, 2003

It doesn't matter which part fails. Your sequence of completed actions (say, MyTransaction) would contain those actions/functions which went okay (anything else should have rolled itself back, if you see what I mean), and these are the ones that need to be undone. You could do the rollback procedurally (checking the return code of each function), but it would be horrendously complicated, far more so than calling MyTransaction.rollback() (which contains the successful actions, added there by the copyFiles and createRegistryEntries functions). You can even nest all that within itself and get an almost infinitely granular transaction/rollback system.
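
A minimal Java sketch of that idea, assuming a hypothetical Transaction class (standing in for MyTransaction) that each step adds its own undo action to once it has succeeded. The in-memory list and the failure switch are placeholders for illustration, not a real installer API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: collects undo actions for steps that succeeded.
class Transaction {
    private final Deque<Runnable> undoActions = new ArrayDeque<>();

    // Called by a step only after it has completed successfully.
    void registerUndo(Runnable undo) {
        undoActions.push(undo);
    }

    // Undoes the completed steps, most recent first.
    void rollback() {
        while (!undoActions.isEmpty()) {
            undoActions.pop().run();
        }
    }
}

class InstallDemo {
    // Stand-in for the real system state (files, registry entries).
    static final List<String> state = new ArrayList<>();

    static void copyFiles(Transaction tx) {
        state.add("files");
        tx.registerUndo(() -> state.remove("files"));
    }

    static void createRegistryEntries(Transaction tx, boolean fail) {
        if (fail) {
            throw new IllegalStateException("registry write failed");
        }
        state.add("registry");
        tx.registerUndo(() -> state.remove("registry"));
    }

    // Returns true on success; on failure, undoes whatever had completed.
    static boolean install(boolean failRegistry) {
        Transaction tx = new Transaction();
        try {
            copyFiles(tx);
            createRegistryEntries(tx, failRegistry);
            return true;
        } catch (RuntimeException e) {
            tx.rollback();
            return false;
        }
    }
}
```

The catch block never needs to know which step failed: it only undoes what was registered.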


Thursday, October 16, 2003

Another thing that helps with this is not to let exceptions propagate up the call stack in many cases.  Instead, translate the lower level exception into a new exception that captures the semantics of what the exception means in the context of the higher level caller. 

Also, like some others have mentioned, if you capture the operations that you wish to include in the transaction as first class objects rather than as simple method/function calls, you can have the operations monitor their own state and respond appropriately to calls to commit, rollback, execute, and so on.

sleepy
Thursday, October 16, 2003

How is this messy?

try {
  copyFiles();
} catch (Throwable t) {
  deleteFiles();
}

try {
  createRegistryEntries();
} catch (Throwable t) {
  deleteRegistryEntries();
}

Scot
Thursday, October 16, 2003

To expand on Sleepy's comment above:

try {
  copyFiles();
} catch (Throwable t) {
  deleteFiles();
  throw new InstallationFailedException(t);
}

try {
  createRegistryEntries();
} catch (Throwable t) {
  deleteRegistryEntries();
  throw new InstallationFailedException(t);
}

Scot
Thursday, October 16, 2003

I think the proper way to do this involves the use of either deterministic destructors or finally clauses, depending on which your language of choice provides.

If you really need proper rollback support, the complexity grows linearly with the number of things grouped together as a transaction.  Basically, you do operation A.  Then you set a flag that says "operation A needs rollback on failure" Then you do some other operation B, then set the "B needs rollback on failure" flag.  Etc, etc.  Once you finish the transactions, you unset all these flags again (or set another "I'm done" flag.)

In the finally clause (or the deterministic destructor of a temporary stack based object), you check to see which, if any, operations need to be unrolled and do it.
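
Roughly, in Java, the flag approach might look like the sketch below. The step names (doA, undoA, and so on) and the log list are made up for illustration; real steps would touch files or the registry:

```java
import java.util.ArrayList;
import java.util.List;

class FlaggedInstall {
    // Stand-in for real side effects; records what happened, in order.
    static final List<String> log = new ArrayList<>();

    static void doA() { log.add("A"); }
    static void undoA() { log.add("undo A"); }

    static void doB(boolean fail) {
        if (fail) {
            throw new IllegalStateException("B failed");
        }
        log.add("B");
    }
    static void undoB() { log.add("undo B"); }

    static void run(boolean failB) {
        boolean aNeedsRollback = false;
        boolean bNeedsRollback = false;
        boolean done = false;
        try {
            doA();
            aNeedsRollback = true;
            doB(failB);
            bNeedsRollback = true;
            // Further steps would go here, each followed by its own flag.
            done = true; // the "I'm done" flag
        } finally {
            if (!done) {
                // Undo in reverse order of execution. The B check only
                // matters once more steps follow B.
                if (bNeedsRollback) undoB();
                if (aNeedsRollback) undoA();
            }
        }
    }
}
```

The exception still propagates to the caller after the finally block runs, so the failure is not swallowed.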

Andrei Alexandrescu and Petru Marginean have written a C++ template-based solution to this problem utilizing the above.  It does some crazy things with templates, but it is a clever solution.  (The CUJ site appears to be down at the moment, though.)

http://www.cuj.com/documents/s=8000/cujcexp1812alexandr/


Really, though, if you don't need full transactional support, it's more complicated than it's worth.  If you have some routine that displays a bunch of data to the user, and some deeply nested call throws an exception, you oftentimes don't care exactly how much of the routine you'd gotten through.  All you need to do is present an error message along the lines of "XYZ failed, please do ABC to fix it and try again".  (Where XYZ and ABC depend on which type of exception was thrown, not which piece of code threw it.)  When the user tries again, you'd be starting at the beginning of the routine anyway.  That to me is the real power of exceptions.  When you DON'T need fine-grained knowledge of exactly where you were when you failed, it allows you to greatly simplify the error handling logic and not lose any errors in doing so.

Michael Kale
Thursday, October 16, 2003

> Is there an easy way to do this properly using exceptions?

Uh, how about:

// Or handle the exception in this method for common
// cleanup
public void doInstall() throws InstallationException
{
  doCopyFiles();
  doMakeRegistryEntries();
}

public void doCopyFiles() throws InstallationException
{
    try
    {
        copyFiles();
    }
    catch (IOException ioe)
    {
        deleteCopiedFiles();
        throw new InstallationException();
    }
}

... the rest is obvious.

No, Joel, I already have a job.

The point about doing the recovery (ie, the implementation of deleteCopiedFiles() and deleteRegistryEntries()) is still well taken, but IMO is part of the problem domain: you are trying to mimic transactions here, and of course it's a bit difficult.

Portabella
Thursday, October 16, 2003

Scot, you're right, that doesn't seem too messy. I think Scot's method is what I would use if I had the need.

Fred
Thursday, October 16, 2003

Scot - your method doesn't delete the copied files when the registry portion fails.  You could add that to the cleanup of the registry portion, but this would get extremely messy and unmaintainable when the number of steps increased.  How about something like this:

vector<CInstallOption*> l_vOptionsToRollBack;
try
{
  CInstallOption *l_pFileInstall = new CFileInstall();
  l_vOptionsToRollBack.push_back(l_pFileInstall);
  l_pFileInstall->Install();

  // repeat for each step of the install
}
catch(...)
{
  // for each item in l_vOptionsToRollBack, call RollBack method
}

You'd have to clean up the memory in the above example, but it seems cleaner than putting unique rollback logic in for each failure.

Brad
Thursday, October 16, 2003

int DoSpagetti(int a, int b)
{
    if(!DoThing1(a)) goto rollback_1;
    if(!DoThing2(b)) goto rollback_2;
    if(!DoThing3(a+b)) goto rollback_3;
    // success!
    return TRUE;
rollback_3:
    DoRollback2(b);
rollback_2:
    DoRollback1(a);
rollback_1:
    // failed
    return FALSE;
}

couldn't resist, sorry ;-)

i like i
Thursday, October 16, 2003

Best code I've seen yet!

?
Thursday, October 16, 2003

Well,

It seems to me the best solution to this problem is outlined in a previous post:

http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=78077&ixReplies=10

Maxime Labelle
Thursday, October 16, 2003

BOOL bDoneUpdateRegistry = FALSE;
BOOL bCopyFiles = FALSE;

try {
  UpdateRegistry();
  bDoneUpdateRegistry = TRUE;

  CopyFiles();
  bCopyFiles = TRUE;
}
catch (InstallOpException e) {
  // Undo in reverse order of execution
  if (bCopyFiles) {
    UnCopyFiles();
  }

  if (bDoneUpdateRegistry) {
    UnUpdateRegistry();
  }
}

Code looks a bit messy but can be cleaned up by having a shared status section. You get the idea.

robtwister
Thursday, October 16, 2003

i like i, in fact your code is not as bad as it is believed to be. Actually this is exactly the example where usage of goto is justified. If the function is not 10 pages long, the logic is pretty visible and is easy to understand.

Passater
Thursday, October 16, 2003

The prick waving about rollbacks on installation is getting a little excessive. This problem is relatively simple with just a little thought given to design, no matter the language you use.  If, however, it's giving you a problem, I'll be happy to help you out.  Just forward your contact information, or better yet your immediate superior's contact information, and I'll be in touch. 

Clay Dowling
Thursday, October 16, 2003

Brad,

your vector of items to rollback is very neat.  It is kinda equiv to the Symbian 'CleanupStack', where you can push things.  At the end of your function you pop things (in proper order, extra error checking).  If the function 'leaves' (the Symbian exception mechanism) then items on the cleanup-stack are automatically rolled back.

Symbian has three obvious ways of handling errors:

* Things that are expected to fail and that app logic is expected to recover from, e.g. CopyFiles, return a result code.  The return value is always a result code, e.g. KErrNone or KErrBadFilename or whatever, and is never used to return info to the app logic.

* Unexpected exceptions, fairly serious, 'leave'.  This is the Symbian equiv to a (low overhead) exception.  You get very little error info.  All functions that might leave have an 'L' suffix to their name, making it easy to keep track of.  You do have the equiv of try-catch (termed a 'trap') but it is really precisely for tidying up, so a very coarse-grained recovery.  For example when memory cannot be allocated.  I've already mentioned the useful CleanupStack mechanism for automating deallocating heap stuff (and other uses) in case of failure.

* Fatal errors (usually programmer induced, like passing an invalid handle or a GPF or something) result in a 'panic'.  There is a chance to add an 'on panic' listener for tidy up to a thread, but you can't stop the thread dying.  It is usually kernel-side code that panics a thread.

i like i
Friday, October 17, 2003

> This problem is relatively simple with just a little thought given to design

I have to disagree.

The simple case, which works most of the time, *is* simple: you just delete the files that you've copied, the registry entries that you've made, and any other operations which need to be undone. (And parenthetically this is why I think installations should mainly be file copies).

Problems occur when your rollback operations themselves fail.

If you've decided that having stray files, registry entries, etc on a few relatively rare occasions isn't a problem, then stop here: you really are done.

If you really need the system to return to the exact same state as before installation every time, then, as I said before, you are essentially implementing a transaction, and you need to analyze all possible cases where the system might fail.
For instance, the system might crash before removing the registry entries, so you'll want to have a persistent record of which entries to remove. And so forth.
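
As a rough illustration of such a persistent record, here is a Java sketch of a journal file that gets one line per change before the change is made. The InstallJournal class and the entry strings are hypothetical, and this only covers the bookkeeping, not the undo operations themselves:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical journal: one line per change that may need undoing.
class InstallJournal {
    private final Path file;

    InstallJournal(Path file) {
        this.file = file;
    }

    // Record the entry on disk BEFORE making the change, so that a crash
    // in between still leaves a record of what to look for.
    void record(String entry) throws IOException {
        Files.write(file, List.of(entry),
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND,
                StandardOpenOption.SYNC);
    }

    // Entries that still need rollback, most recent first.
    List<String> pendingInReverse() throws IOException {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        List<String> entries = new ArrayList<>(Files.readAllLines(file));
        Collections.reverse(entries);
        return entries;
    }

    // Called after a successful install or a completed rollback.
    void clear() throws IOException {
        Files.deleteIfExists(file);
    }
}
```

On startup, an installer could check pendingInReverse() and, if it is non-empty, finish the rollback that the crash interrupted. The undo for each entry has to tolerate the change never having happened, since the record is written first.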

Still not rocket science, and a well-studied problem to boot, but quite a bit gnarlier than the basic best-effort undo.

Portabella
Friday, October 17, 2003
