Fog Creek Software
Discussion Board




So how should you rewrite?

Joel makes a very strong argument for never rewriting from scratch.

So how should you go about rewriting code? Sometimes it is necessary.

We face this problem ourselves.  We have a working system but it was written on the learning curve.  The early stuff is very poor and will need replacing.

Rather than rewriting from scratch, what strategies would you recommend for bringing our code up to standard?

Ged Byrne
Monday, December 03, 2001

Can you rewrite it incrementally?  If it was at least built in sections, can you replace things section by section?  That way, as the inevitable bugs occur, you can localize them to the section you happen to be working on.

Hopefully, you kept the same coders, so there's a knowledge of what "you'd do next time."

BTW, refactoring is all about this.  It's just the idea that you're doing only one of two things at any time:
- Transforming code to a functionally equivalent form.
- Extending it for new functionality.

And those transformations are fairly well-defined in a $50 book.
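
As a rough illustration of the first kind of step (transforming code to a functionally equivalent form), here is a minimal Python sketch of one such transformation, extracting a helper function. The invoice example, the function names and the discount and tax rules are all invented for illustration; the point is only that the "before" and "after" versions must give the same answer.

```python
def invoice_total_before(items):
    # Original form: line pricing, bulk discount and tax tangled in one loop.
    # Prices are integer pence, so the arithmetic is exact.
    total = 0
    for price, qty in items:
        line = price * qty
        if qty >= 10:
            line = line * 9 // 10  # 10% bulk discount
        total += line
    return total * 120 // 100  # add 20% tax


def line_amount(price, qty):
    # Extracted helper: the same bulk-discount rule, now named and testable.
    amount = price * qty
    return amount * 9 // 10 if qty >= 10 else amount


def invoice_total_after(items):
    # Refactored form: same behaviour, clearer structure.
    subtotal = sum(line_amount(price, qty) for price, qty in items)
    return subtotal * 120 // 100


sample = [(250, 4), (100, 10)]
# The refactoring is only acceptable if behaviour is unchanged.
assert invoice_total_before(sample) == invoice_total_after(sample)
```

New functionality (say, a second discount tier) would then go in as a separate step, after the equivalence check passes.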

You probably want a very small team to do this, at least at first.  That way you can learn about the problems and have a good idea of the territory.

Richard Jenkins
Monday, December 03, 2001

Are there any good texts about how to refactor code?

Everything seems to be geared towards building a new system.  I suspect this is one reason that programmers like to build from scratch - it's the only way they've been taught.

Ged Byrne
Monday, December 03, 2001

Here's the Amazon page for the canonical book:
www.amazon.com/exec/obidos/ISBN%3D0201485672/102-1265738-7428166

where you can read about how it does people's dishes and brings world peace... ;)  There's also:
www.refactoring.com/

Richard Jenkins
Monday, December 03, 2001

Links redux:

http://www.amazon.com/exec/obidos/ISBN%3D0201485672/102-1265738-7428166

http://www.refactoring.com/

Richard Jenkins
Monday, December 03, 2001

I too believe that incremental refactoring is the way to go. What you're experiencing has got to be very common on large software projects where the developers care about the code base. I do want to bring up some problems, though.

Inevitably, the net result is a code base which does similar things in several different ways. How do you ensure that everybody on the team starts doing things the "new" way?

I'm wondering if anyone maintains an "implementation guidelines" document which describes current practices, and if anyone would even read it.

And how do you go about changing these practices when the company VP, who is also the team lead but spends only about 10% of his time near the code, thinks that the code doesn't need refactoring and that we should only spend our time fixing bugs (!) and cranking out new code as quickly as possible?

B
Monday, December 03, 2001

Richard,

Thanks, this is exactly the type of thing I was looking for.

Ged Byrne
Monday, December 03, 2001

B,

This is exactly our problem.  Each programmer has their own method for doing common tasks.

I think this is made worse in web apps because you spend most of your time coding 'clever' workarounds.  Programmers are particularly proud of their bag of tricks and reluctant to change.

Ged Byrne
Monday, December 03, 2001

That's what standards are for, and code reviews to enforce them.  If a programmer's bag of tricks is good enough, then it should be incorporated into the standards (which is why standards written by ivory tower architects tend to fail: too little attention to practice); if the programmer won't play nice, well, there are plenty of other programmers out there.

Also, documentation, in code and out of it, about design decisions made when refactoring is important, so people understand why some code is new and some is old.

Incremental changes can yield good results, too.  I've recently quadrupled performance in a client's web app just by changing a few key areas, and it was accomplished with a few weeks of my time, not the months of many programmers which would be required by a ground-up rewrite.

When the client moves to fix shortcomings in the data model, we'll do it by changing the DB structure, slapping views on to simulate the old schema, migrating code in order of importance for functionality and performance, and documenting what's changed so new programmers understand why some code refers to old structures.
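
A minimal sketch of the "views simulating the old schema" idea, using Python's standard sqlite3 module. The table and column names are hypothetical, not the client's actual schema: the data moves into new, normalised tables, and a view with the old table's name and shape keeps unmigrated code working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# New structure plus a compatibility view (all names are hypothetical).
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(id),
        amount_cents INTEGER NOT NULL
    );
    -- View with the name and columns of the old, denormalised table.
    CREATE VIEW orders AS
        SELECT o.id           AS order_id,
               c.name         AS customer_name,
               o.amount_cents AS amount
        FROM customer_order AS o
        JOIN customer AS c ON c.id = o.customer_id;
""")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
conn.execute(
    "INSERT INTO customer_order (customer_id, amount_cents) VALUES (1, 1250)"
)


def legacy_report(db):
    # Unmigrated code: still queries the old table name and old columns.
    return db.execute(
        "SELECT customer_name, amount FROM orders ORDER BY order_id"
    ).fetchall()


legacy_rows = legacy_report(conn)
```

In SQLite a plain view is read-only, so this only covers old code that reads. Code that writes through the old schema would need INSTEAD OF triggers on the view, or would have to be migrated first.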

Rodger Donaldson
Monday, December 03, 2001

So let me get this straight so far:

1) Identify best practice and document it into standards, producing a standard 'bag of tricks.'  Would the Perl Cookbook be a good example of what to aim for?

2) Document everything.

3) Establish interfaces and ensure that these are respected when changes are made.

Create a new interface if required and support both until all code has been refactored.

Do you find that creating unit tests for these interfaces is useful?  I worry that too much time might be spent trying to get the test code to work.
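
To make point 3 concrete, here is a small Python sketch of supporting an old and a new interface side by side, with a unit test that pins both down. The class and method names are invented for illustration: the old interface is kept as a thin adapter over the new code, so unmigrated callers keep working until they have been moved across.

```python
import unittest


class NewPriceService:
    """New interface (hypothetical): prices as integer cents."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_cents(self, sku):
        return self._prices[sku]


class OldPriceAdapter:
    """Old interface, kept alive for callers that have not been migrated."""

    def __init__(self, service):
        self._service = service

    def get_price(self, sku):
        # Old callers expect a float amount in dollars.
        return self._service.price_cents(sku) / 100


class PriceInterfaceTest(unittest.TestCase):
    def setUp(self):
        self.service = NewPriceService({"A1": 1250})

    def test_new_interface(self):
        self.assertEqual(self.service.price_cents("A1"), 1250)

    def test_old_interface_agrees_with_new(self):
        old = OldPriceAdapter(self.service)
        self.assertEqual(old.get_price("A1"), 12.5)

    def test_unknown_sku_fails_the_same_way_through_both(self):
        with self.assertRaises(KeyError):
            self.service.price_cents("missing")
        with self.assertRaises(KeyError):
            OldPriceAdapter(self.service).get_price("missing")


def run_tests():
    # Helper so the tests can be run from a script or an interactive session.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceInterfaceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests at this level only exercise the interface, so they should not need rewriting when the code behind it is refactored. Once the last caller of the old interface has been migrated, the adapter and its tests can be deleted together.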

Ged Byrne
Tuesday, December 04, 2001

We have rewritten twice before and were better for it. We now need to do so again.

The main argument against rewriting given in the interview is the time spent doing this while your competitors 'eat your lunch'. There are a number of things we have done and can do during rewrite to ease the pain:

1. There are at least two stages where something sellable can be produced from and during the development towards the full rewrite.
2. There are spin-off benefits for the existing software from the rewrite: things that we can take across to make new features in the old software. That way they also get tested earlier.
3. We have features in the bank that currently cannot be accessed from any menus to keep us going.
4. We have recently put a lot of new features into the existing software that give us unique selling points.
5. We have already started work on it some time ago during a rare lull to get the ball rolling.

Also there are good reasons for us to rewrite:

1. We now need to incorporate a whole new set of features. Doing this makes the rewrite half way towards a new product requiring a rethink on all existing features and how they will work with the new features.
2. We stay ahead of the competition by putting in features they haven't thought of and keep up with them by putting in the features they have. We keep customers and gain repeat business from them by putting in the features they've asked for. It's part of the support they pay for. We gain large contracts from putting in the features a potential client says need to be in the software before they buy. It is taking longer and becoming more error-prone to continue doing this. We need better-designed code to continue to do this. If we don't then we slowly grind to a halt as a business. We could address these issues by renovating certain areas but in doing this you have to take into account how this will affect the rest of the code - especially when you consider the state the code is now in. When the areas that need renovating include most of the software then it becomes quicker to rewrite. Perhaps it's the price we have to pay for speed.
3. We need a complete rethink on how all the features work together due to the accumulated effect of the continuous ad hoc addition of all these features.
4. We need certain parts of the program (interface code, data storage code, certain feature groups) in their own modules so that we have a more flexible code base. This will help us better meet individual customer needs - so that everybody doesn't get all the features one person has asked for and nobody else wants. Also so that we can reuse parts in other software. And also so that in future we don't have to completely rewrite the whole thing again.
5. There are certain changes which need to be made that would affect everything - for example, using a proper database that other software can read instead of the rather eccentric one we wrote ourselves some years ago. The data logic is scattered across most of the code. I have considered a gradual program of first encapsulating all of the data storage logic, since if we were part way through this work we could still release and nobody would notice. Once completed, we could then change how data is saved and loaded by changing only one place, assuming the interface to the data logic doesn't need to change. However, this would mean a lot of work that otherwise wouldn't be needed: writing something that will convert 'read this bit from that file in there' (i.e. how we get and put stuff into our 'database') to an SQL query.
6. So that if I get run over by a bus or am the next one to suffer a nervous breakdown here a new programmer can glide straight in!
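
The gradual encapsulation described in point 5 could look something like the following Python sketch. Everything here is an assumption made for illustration: the `RecordStore` interface, the 'key=value' flat file standing in for the home-grown store, and the SQLite table standing in for the proper database. The idea is that application code talks only to the interface, so the storage behind it can be swapped in one place.

```python
import os
import sqlite3
import tempfile
from typing import Optional, Protocol


class RecordStore(Protocol):
    """The one place application code is allowed to touch storage."""

    def load(self, key: str) -> Optional[str]: ...
    def save(self, key: str, value: str) -> None: ...


class LegacyFileStore:
    """Stand-in for the home-grown store: 'key=value' lines in a flat file."""

    def __init__(self, path: str) -> None:
        self._path = path

    def _read_all(self) -> dict:
        try:
            with open(self._path, encoding="utf-8") as f:
                return dict(
                    line.rstrip("\n").split("=", 1) for line in f if "=" in line
                )
        except FileNotFoundError:
            return {}

    def load(self, key: str) -> Optional[str]:
        return self._read_all().get(key)

    def save(self, key: str, value: str) -> None:
        records = self._read_all()
        records[key] = value
        with open(self._path, "w", encoding="utf-8") as f:
            for k, v in records.items():
                f.write(f"{k}={v}\n")


class SqlStore:
    """Replacement store backed by a real database (SQLite in this sketch)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS record "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def load(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM record WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def save(self, key: str, value: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO record (key, value) VALUES (?, ?)",
            (key, value),
        )


def rename_customer(store: RecordStore, customer_id: str, new_name: str) -> None:
    # Application code sees only the interface, never files or SQL.
    store.save(f"customer:{customer_id}:name", new_name)


# The same application code runs against either backend.
with tempfile.TemporaryDirectory() as tmp:
    file_store = LegacyFileStore(os.path.join(tmp, "data.txt"))
    rename_customer(file_store, "7", "Acme Ltd")
    file_result = file_store.load("customer:7:name")

sql_store = SqlStore(sqlite3.connect(":memory:"))
rename_customer(sql_store, "7", "Acme Ltd")
sql_result = sql_store.load("customer:7:name")
```

While the scattered data logic is being moved behind the interface, the product can still ship on the file-based store; the switch to SQL only happens once nothing bypasses the interface.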

Joel says, "It's not like code rusts if it's not used. The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed. There's nothing wrong with it."

Old code might not rust but it does get bent out of shape if it is being constantly changed and it is absurd to think that you can keep bending it for ever out of all recognition of what it was originally designed to do.

Another argument given by Joel for why a complete rewrite is not necessary is that code is necessarily messy. What makes it messy is all the bug fixes: bug fixes you will lose when you rewrite.

1. The three bug fixes cited by Joel represent a minority category of bug fixes in my experience - those to do with external circumstances. The majority of bug fixes in bad code are the result of bad design. For example: forgetting to apply a patch in all the places it now needs to be applied, or changing a function to fix one place where it is called and not realising that it is called to do other things in other places. Most of the messy fixes wouldn't need to be there in the first place if the code was better designed, and so don't need to be in the rewrite.
2. We are not throwing out that information; all the bugs we have fixed are logged in our database, categorised by feature area. When we build the equivalent new feature we look at the old equivalent for a number of things, one of which is what has been fixed.

The arguments made in the interview are against completely starting from scratch and not using the old code for anything. That is not the way to rewrite! We are not proposing that we should start from a complete blank sheet.

1. There are algorithms in the old code we can bring across - e.g. the tricky mathsy stuff.
2. There are a lot of good feature ideas, all of which should be brought across, improved, better integrated and made more consistent with other features.
3. And there are the bug fixes we need to take across recorded in our database.

Joel also cites some examples of rewrite disasters.

1. The Netscape example was more a case of a new set of programmers coming in who did not have the patience to understand the existing features and code. This is not applicable in our case.
2. And of course there will be disasters; I do not assume that every rewrite will necessarily be a success, so of course there are disaster examples. We've done it before, twice (or three times if you consider a certain competitor who used to work for us). We couldn't have continued if we hadn't.

Joel has worked for Microsoft (a completely different universe to us) and is a 'name' in software. So he is probably only used to working with top talent and has probably never seen code as bad as ours. At Microsoft, if they need more programmers to keep something going, they probably clone them at their secret laboratories. Our clueless controllers have to make do with us for fear of something worse, and believe me you could do a lot worse! We've hired them; they didn't stay long!

There hasn't been anything he has said that should dissuade us from rewriting. JoelOnSoftware.com looks like a very useful place and it at least touches on all the important issues so I have bookmarked it. But perhaps sometimes he writes more than he thinks. However, that someone who used to work for the god of software is so dead set against rewriting has knocked my confidence in this decision. We can look at the things we are trying to achieve by rewriting and see how close we can get in a year without a rewrite. I haven't looked at the links posted in this thread yet either to see if there is anything there to change my mind. I'm in work tomorrow so will see what everyone else has to say.

Edvard Kardelj
Tuesday, December 04, 2001

The trick, I think, in this is to continue to maintain the existing product line with incremental features whilst at the same time having a competitive design and development process happening in parallel.

This is a difficult balancing act.  Those working on the older product line can become de-motivated thinking there is no future in what they are doing.  Customers can delay their buying decisions because they get to know about the competition too early.

It is possible to achieve though and achieve it in such a way that the replacement of the competitive development provides its own impetus and generates upgrade/crossgrade sales that more than make up for the added development costs.

One of the benefits of this competitive development is that those working on the 'old' code get a new lease of life challenging the new product design by adding very complex features to their stable platform.

It can even turn out that the competitive design dies because the old code is proven to be workable and adapts faster to the changes prompted by the competition. 

Simon Lucy
Tuesday, December 04, 2001

Simon Lucy wrote:
'The trick, I think, in this is to continue to maintain the existing product line with incremental features whilst at the same time having a competitive design and development process happening in parallel.'

Thank you Simon. You’ve summed up into a single sentence exactly what we are trying to do. I shall be stealing that sentence in future next time I have to talk to a manager type!

Simon Lucy wrote:
'This is a difficult balancing act. Those working on the older product line can become de-motivated thinking there is no future in what they are doing.'

The line between the old and new code is blurred so that in effect everyone is working on new code. For example, there are some parts of the program that use and are used by almost everything else ('the sump'). Then there are other areas on the outskirts that use and are used by very little. Such outlying areas would include reporting features. They only read the existing data in order to list it; they do not write to it, and at most they activate the appropriate input form for a data item when the row that represents it is double-clicked. We develop the new reporting features to work with both the new and old code. When as much new code as can be shared is done, we 'drop the sump' and replace that.

Simon Lucy wrote:
'Customers can delay their buying decisions because they get to know about the competition too early.'

We are letting key customers in early in the development process - releasing prototypes and functional specs that are just screens and show how the screens work together. The theory is that if we involve our most enthusiastic customers in this way it will generate interest and repeat business, and whatever they get in return for doing this will be outweighed by greater sales. In many of our larger client companies the software seems to be confined to small enthusiastic groups - the idea is to get it to spread across the entire organisation. One of our clients is convinced that our software will 'take over the world!'

Simon Lucy wrote:
'It is possible to achieve though and achieve it in such a way that the replacement of the competitive development provides its own impetus and generates upgrade/crossgrade sales that more than make up for the added development costs.'

Yes, one of the things we are learning is that the larger the client company, the more important the software architecture is. What database does it use? How well does it integrate with the other software they have? Can it be made to save and load to and from their existing databases instead of duplicating the data again with more software? Can they use their existing software to implement this part of the functionality and use ours for the rest by only using certain modules? It is quicker to rewrite than to address these issues in the existing code.

Simon Lucy wrote:
'One of the benefits of this competitive development is that those working on the 'old' code get a new lease of life challenging the new product design by adding very complex features to their stable platform.

'It can even turn out that the competitive design dies because the old code is proven to be workable and adapts faster to the changes prompted by the competition.'

I did broach the subject with the other coder, in the car home (where these strategic matters are usually discussed), that he could do most of the maintenance on the old code, and he didn't seem to mind. But there didn't seem much point with only two of us and other products to support.

Edvard Kardelj
Tuesday, December 04, 2001

If you're designing a new product or moving to a different platform, sure go ahead and rewrite. And parallel development is the way to help your business stay in business until the new product is ready to ship.

But if you're going to need most of the features from the old application in the new one, and you're not changing your target platform, you're not spending your company's money well.

I think a major contributing factor to bad code quality is that when a developer or a team of developers decide that they need to rewrite the entire application, they completely lose interest in writing solid, readable code, which of course will support the idea of a rewrite.

Also, if you decide a rewrite is necessary now, you will do it again. It's possible that the next rewrite will be more architecturally sound (and it will certainly be fun in the beginning) - but you will not cease to learn and wish you had done things differently.

To get out of this situation you'll have to be specific about what the problems are with the code base. Most of the problems are unrelated and can be attacked one at a time.

Joel talked about an application's "user model" in one of his articles. The same goes for the source code. The developer is the user. If you're guessing where a particular feature is implemented, and you're right, the user model corresponds to the source model. Find the flaws in the source code model, and fix them one at a time. And never cease to put code quality above everything else.

Johannes Bjerregaard
Tuesday, December 04, 2001

"So how should you rewrite?". I guess the best answer is "don't rewrite - refactor". Martin Fowler's book ( http://www.refactoring.com ) describes the process in detail.

Of course, it is not so easy to start refactoring your existing 100000 LOC project. You should have unit tests for all classes, and you should have functional regression tests.

We started to practice refactoring several months ago. At that point we already had a working application that had been in development for 3 years. First the team accepted a rule: don't change code that has no unit tests. So in order to change any line of code that had no unit tests, they had to write the tests first. After some time we had enough unit tests to start moving forward. Currently I can say that we can make any design change without rewriting.

Of course we are far from complete freedom of change. Here are the rules that we are implementing now:

- Write the unit test BEFORE you write the class itself.
- Don't test manually. Always write automated functional tests (it is not that hard).
- When you find a bug that should be fixed, don't explain how to reproduce it - write an automated test that fails while the bug is not fixed.
- The developer in turn should reproduce this bug in a unit test.
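
As an illustration of the last two rules, here is a minimal Python sketch of a bug captured as an automated test. The function and the bug are invented for the example: suppose the report was that a quantity typed with surrounding spaces, such as " 3 ", was rejected. The test is written first and fails; it passes only once the fix (the `strip()` call) is in place, and it stays in the suite as a regression test.

```python
import unittest


def parse_quantity(text):
    """Parse a quantity field from user input (hypothetical function)."""
    text = text.strip()  # the fix: without this line, " 3 " raised ValueError
    if not text.isdigit():
        raise ValueError(f"not a quantity: {text!r}")
    return int(text)


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_surrounding_whitespace_is_accepted(self):
        # Reproduces the reported bug; failed before the fix was made.
        self.assertEqual(parse_quantity(" 3 "), 3)

    def test_garbage_is_still_rejected(self):
        # Guards against "fixing" the bug by accepting everything.
        with self.assertRaises(ValueError):
            parse_quantity("three")


def run_tests():
    # Helper so the tests can be run from a script or an interactive session.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        ParseQuantityRegressionTest
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```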

Roman Eremin
Tuesday, December 04, 2001

Edvard Kardelj:
Thank you Simon. You’ve summed up into a single sentence exactly what we are trying to do. I shall be stealing that sentence in future next time I have to talk to a manager type!

The Invoice is in the post.

Simon Lucy
Tuesday, December 04, 2001

Thanks for your advice.

The problem is that we are an inhouse programming team.  Management don't want a product that is continually being updated.  They want an application that can be handed over to production and forgotten about.

I think this is part of the problem, as this seems an impossible goal to me.  At the moment they write an app, and leave it in use until it just can't cope any longer.  Then they write a whole new app.

I think the way forward is continual improvement of the existing code base, just like on the mainframe.  The problem is that all the developers seem to prefer writing new apps.

Ged Byrne
Wednesday, December 05, 2001

This is a very interesting topic, and I have some personal experience with it. Incremental is certainly the way to go, but there are several ways of going about it that I have done or been involved in before.
1) You have code that has been written and functions, but it is slow, hard to maintain, or hard for someone to understand and add features to. Years ago we had such an animal: I took over the code to add some new features as well as fix bugs. After spending some time with it I found these issues, and by then I had learned the code and a number of its problems. I spoke with another good engineer and we basically moved into the same office and even the same desk. We sat side by side for 6 weeks incrementally writing, testing and rewriting, in parallel, somewhere around 70K lines of code. Both of us had a very good understanding of the code. The first couple of days were spent just identifying the bottlenecks and major problem areas. We prioritized them and then just sat there splitting the problems between us and coding. Interfaces and parameters were agreed on the fly! We would write new functions and then do a code walkthrough with each other. When we were finished we put it back together and tested it together, then moved to the next priority on the list. When we finished we had dramatic improvements in performance and maintainability, and had added a number of new features.
2) I've also been involved in incremental rewrites where pieces were rewritten as skunk works!
3) I have also seen projects where a rewrite design document was in place for the project, and seen them work. One of the fundamental points here was that the product should always be kept shippable, and the new features and rewritten code were prioritized.

RFW
Wednesday, December 05, 2001

I think Joel's original statement was that there are some circumstances where a rewrite might be in order. He mentioned platform changes and architecture changes.

We, in fact, recently finished rewriting our company's flagship product. Had to be done. Platform change (sorta: the old product was written in Visual FoxPro, the new one is all C++ and COM+). Also, major architecture changes: we went from using the FoxPro DB to using MS SQL Server and a distributed COM+ environment in an effort to allow the product to scale easily.

So, in this case, a rewrite was the only option. We weren't going to be able to port the FoxPro app to use COM+ and MS SQL Server. Not to mention, the old codebase (which I had nothing to do with), was spaghetti. REAL spaghetti.

Sometimes a rewrite is the best idea, despite what Joel said. I would rarely rewrite my own code or the code that my group writes, from scratch. The team I've put together is brand new and the old team is gone. The old code was a mess, the new code is structured and thought went into it before implementation instead of after it. I think a rewrite was the only option we had.

Pete Davs
Wednesday, December 05, 2001

Discussed rewrite/refactor with my colleague on the way home today. Me: ‘Do you think we should rewrite?’ Colleague: ‘Do you mean ‘do you think they’ll let us rewrite’?’ Me: ‘No. Do you think we should rewrite?’ Colleague: ‘Do you mean ‘Do you think we should rewrite whether they let us or not’?’ Me: ‘No. Do you think we should rewrite?’
And so on…

After administering a few slaps across the face we came up with a list of areas of the program that are less coupled to everything else that we could rewrite or refactor for the old program.

Some lumps we’ll rewrite – not much point in refactoring bad code to good code if the feature is all wrong and we then have to throw away the nice new refactored code to then make the feature right. Some lumps we can refactor first then enhance. In this way we can continue to surprise and delight our clients with our wonderful upgrades.

The software less all the lumps was surprisingly smaller than I expected. This is the core, without which there isn’t an app. We write a new core in parallel for which we have other uses.

The new stuff in the old program we write with well-defined interfaces so we can easily stick them on the new core. Then there’s some stuff to sprinkle over nearly everything that will have to wait till the end.

We have all the bugs we’ve ever fixed recorded in our database so that we don’t lose any fixes that are still relevant in the bits we rewrite. We already have a first draft design and a fairly detailed functional spec of the ‘rewrite’. So we have a rough idea of what we think it should probably look like when it’s all done – which is a bonus.

After first coming here I felt as if I was facing a fork in the road and didn’t know which way to turn. But thanks to everyone here, and especially Joel, I now see the way ahead.

It’s like making a trip across an ocean stopping off at various islands along the way to pick up more provisions and make repairs. It would take us longer than going straight across but our boat is too small to hold enough to see us through and too unreliable to make that distance.

I’ve got till the new-year to flesh out this plan.

Edvard Kardelj
Wednesday, December 05, 2001
