copy 'n' paste vs. reducing dependencies/coupling
Do you have any thoughts on this observation:
I have always been the kind of guy who eliminates redundancy in code. Whenever I find myself copying and pasting anything, I usually factor out what I copied into a function or maybe a "template method" design pattern. Even if it is something as small as a function that you tend to call with certain parameters all the time, I'll wrap that up and give it a meaningful name.
But I have come to the conclusion that if you're modifying code that a lot of people call -- especially if it has a lot of state, or it is really messy code -- sometimes it's just better to copy and paste it. Sometimes you just need to get something done, and you don't want to bother to extract the common functionality out of whatever you need, so you just copy the whole thing and modify it. Since it's a separate copy, you know you haven't accidentally broken other code (which you may or may not understand), so this can be a safer way of getting things done. It seems like after a while low-level code becomes stagnant because no one is around who understands it any more. People just hack on top of it because they don't want to change anything for fear of breaking things.
The same theme can be found in large companies with many codebases for similar products... I work at a game company, and pretty much every game duplicates huge amounts of code. We have several different vector math libraries that do pretty much the same thing. We have a bajillion string classes and memory allocators. Even internal tools have their own math libraries. Part of it is that we have acquired other companies, and inherited their code, and never bothered to merge codebases because it doesn't produce any benefit to the consumer.
But it is also fairly common where one team will branch off from a larger team, using that team's code base to create a new game. Theoretically the teams should keep in sync on their common code. But eventually it is too much of a pain in the ass, people don't care to communicate, don't care to integrate code that provides them with no tangible benefit -- and eventually the code diverges to the point that we can't easily share major upgrades. This doesn't happen over years, but over 6-9 months sometimes.
In Large-Scale C++ Software Design, one of the principles (which you don't hear in other books) is that a little redundancy, carefully applied, can reduce dependencies. It seems that we have taken this to its logical conclusion: we have a ton of redundancy and ship games very quickly. We don't have to support code after it is shipped because they are console games. It seems to go against most software development principles but it works (I work at EA, we make lots of money).
So any thoughts on this would be interesting; here are some questions:
1) Do you agree redundancy is good in these cases?
2) How do you tell when you have crossed the line? How do you balance the need to reduce dependencies and coupling vs. the need to have manageable-sized codebases?
3) What are better solutions besides copy and paste and branching codebases?
Saturday, February 21, 2004
What you're describing is famously known as the Lava Flow Antipattern.
The "redundancy" benefit you're describing is something of a red herring. If you have a utility function used in 27 places, and you're afraid that changing it will break those 27 places, it's not hard at all to find what those 27 places are and then retest them. Or fork the function itself, in place, and have two nearly-identical, cut and pasted versions of the function, next to each other, in the same file, so that later someone will say "why are there two versions of the same thing?" and figure out the correct way to merge them.
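To make the "fork it in place" idea concrete, here is a minimal sketch in Python. The function names and the health-clamping behavior are made up purely for illustration; the point is only that the copy lives directly next to the original and is labeled as a fork, so the duplication stays visible.

```python
def clamp_health(value, max_health=100):
    """Original shared helper. Existing callers depend on this exact behavior."""
    return max(0, min(value, max_health))


# FORK of clamp_health (hypothetical example): identical except that it
# allows a temporary overheal above max_health. It is kept directly below
# the original on purpose, so the next reader asks why there are two
# versions and can merge them properly.
def clamp_health_with_overheal(value, max_health=100, overheal=25):
    return max(0, min(value, max_health + overheal))
```

The existing callers of `clamp_health` are untouched, so there is nothing to retest for them, and the new behavior sits one screen away from the old one instead of being buried in a separate copy of the whole module.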
I think the best way to deal with Lava Flow is to figure out how much money and time you are willing to spend on "a better code base." Maybe between each major release you could give the whole team a month or two with no new features to write -- just code base improvements, refactoring, and cleaning things up. Pick the things which are most likely to improve the code base quality enough that there will be some return on investment in increased productivity for the things you improve. The article I wrote called "Rub-a-dub-dub" describes when I did that to FogBUGZ and we've been reaping the rewards of clean well-organized code ever since.
Fog Creek Software
Saturday, February 21, 2004
The rubadubdub article is here:
Thursday, March 11, 2004
I've had a similar situation over the last 8 years.
We wrote one program for commercial sale, preparing for a big user conference. We added a second program almost as an afterthought. Well, the second one was the most popular.
Then we thought: reuse the code for a third program.
Fast-forward 8 years and 18 more programs later. We've stretched that 2nd program about as far as it can go.
And it's in VB 3. I don't think a refactoring is appropriate in this case because there's no underlying architecture. (I've actually refactored bits of the program, but it's time to create an architecture that is extensible, because we've got another 10 programs we'd like to work on, but the difficulty of each addition has increased geometrically!)
Mr. Analogy (formerly The real Entrepreneur)
Sunday, March 14, 2004