Fog Creek Software
Discussion Board




Truths about programming..that make you go "hmm.."

Hi,

I read a comment by someone on JoS that struck a chord:


TRUTH: "Software code is READ many more times than it's WRITTEN"


This is true, but I hadn't ever really thought about it. And it has significant implications for best practices.

IMPLICATION: The author was emphasizing the need for well-formatted code with proper indenting, etc.



{I know, this isn't UNIVERSALLY true. E.g., it's not true for throwaway code, prototypes, etc. But I think it's GENERALLY true in most long-term successful projects (a program that will be used fairly widely for 3 or more years).}

MANAGERS & CLIENTS *SHOULD* KNOW THIS
It occurred to me that there are a lot of programming "factoids" we know instinctively but perhaps have never really thought about. These are also, I think, things that non-programmer managers and clients do NOT know (but SHOULD know to understand).

Example:

TRUTH: 1. The difficulty with updating software isn't so much the actual code you have to write; the difficulty is in the RETESTING of the software to make sure what you updated doesn't have unintended side effects.

IMPLICATIONS 

1A - It's easy to make changes BEFORE you finish the testing.
1B - It's best to BATCH your changes so that you can make many changes and do ONE regression test.


DISCLAIMER
I'm talking about things that are GENERALLY true.  There are always exceptions, but if something is true in most typical projects, then we can call it a TRUTH.


Any other TRUTHS anyone can think of?

Entrepreneur
Thursday, July 10, 2003

One should refactor and THEN update code, never both at the same time and never the latter before the former.

Oftentimes a much simpler solution to the update presents itself after a bit of refactoring, and doing both at the same time is often a source of errors.

Lou
Thursday, July 10, 2003

If you batch your changes, you weaken your ability to localize any resulting problems.  You also weaken your ability to exploit concurrency by making testable modifications wait until all modifications are complete.

anon
Thursday, July 10, 2003

Anon,

Can you elaborate on those two points?

In the first, are you saying that if you make several changes at one time and then get an error, you won't know which change caused the error?

Remember, I'm talking about a REGRESSION TEST here not a Unit test. (Obviously you'd Unit test as you finished making mods to each unit). 

So, according to the above logic, wouldn't it make sense to test after every line of code, so that you'd know which line of code was causing the error?


IMHO, batching your changes and then doing a test saves you tons of time, which you can spend troubleshooting IF you get an error. Narrowing down the possible causes of an error isn't really that difficult if you know what the possible causes are; you just use the process of elimination.



If the above isn't clear, I elaborated ad nauseam below:

If so, then I think that ignores the point that TESTING takes a lot longer than CODING.  I.e., I might spend 10 minutes coding a change (and, of course, doing a unit test). But then I might spend another 20 minutes doing a regression test of the whole program.

Also, the assumption is that the regression test is just to make sure that there are not any errors.  You'd unit test first to make sure your new code was good.

So, if I have 10 changes, each taking 10 minutes, I can do them all at once plus one 20-minute test (total time: 120 minutes).

If I do a test after each change, then it's (10+20) * 10 = 300 minutes, or 2.5 times as long.
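
As a rough check of the arithmetic above, here is a minimal Python sketch. The function name total_minutes and its parameters are made up for illustration, and it assumes every change takes the same coding time and every regression run costs the same.

```python
def total_minutes(changes, code_minutes, regression_minutes, batch_size):
    """Total effort when one regression test runs per batch of changes."""
    # One regression run per batch, rounding up for a partial last batch.
    runs = -(-changes // batch_size)
    return changes * code_minutes + runs * regression_minutes

# The numbers from the post: 10 changes, 10 minutes each, 20-minute regression.
one_batch = total_minutes(10, 10, 20, batch_size=10)
per_change = total_minutes(10, 10, 20, batch_size=1)
print(one_batch, per_change, per_change / one_batch)  # prints: 120 300 2.5
```

Changing batch_size shows the middle ground too, e.g. two batches of five changes.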

Entrepreneur
Thursday, July 10, 2003

I second the "batching changes is bad" notion. If you make a bunch of updates at once and it breaks the build so that your app crashes at startup... where do you start?

Now, if you make one update, rebuild, and the crash happens, where do you start?

See?

Debugging is more of an AI task than pure computational task. Computers are better at making computation efficient, and so they do things like batching... but you are not a computer and this is not really a line-by-line computational task.

Dan J
Thursday, July 10, 2003

anon:
Add one more thing:
Check in your changes often.

That way if a regression test turns up a failure, you can go back through your builds (or generate intermediate builds) and do a binary search to determine which change caused the problem.
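
A minimal sketch of that binary search, in Python. Everything here is hypothetical: the build numbers, the passes() check, and the assumption that the oldest build is good, the newest is bad, and that once a build fails every later build fails too.

```python
def first_bad_build(builds, passes):
    """Return the index of the first build that fails the regression test.

    Assumes builds[0] passes, builds[-1] fails, and that every build
    after the first failing one also fails.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes(builds[mid]):
            good = mid   # the bug was introduced after this build
        else:
            bad = mid    # the bug is in this build or an earlier one
    return bad

builds = list(range(1, 17))  # hypothetical checked-in builds 1..16
tests_run = []

def passes(build):
    # Stand-in for a full regression run; pretend build 11 introduced the bug.
    tests_run.append(build)
    return build < 11

culprit = builds[first_bad_build(builds, passes)]
print(culprit, len(tests_run))  # prints: 11 4
```

With 16 builds that is 4 regression runs to find the culprit, instead of up to 15 when walking the builds one at a time.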

mb
Thursday, July 10, 2003

Entrepreneur,

I guess what I should have said was that "there are times when using many small batches of changes is better than using one large batch."  The rate of the system is governed by the rate of the slowest process in the system.  If testing takes three times as long as coding, then you should code three changes and submit them for testing.  If testing takes half as long as coding, you should submit them one at a time.  All of this assumes that the testing resource is not the same as the coding resource.  In other words, the best solution depends on the properties of the problem.
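
One way to read that rule of thumb, as a small Python sketch. The function name batch_size is invented for illustration, and it assumes (as above) that testing and coding are done by separate resources working in parallel.

```python
import math

def batch_size(code_minutes, test_minutes):
    """Changes to code per test submission so the tester is never idle
    and never backed up: test time divided by coding time, rounded up,
    and never less than one change."""
    return max(1, math.ceil(test_minutes / code_minutes))

print(batch_size(10, 30))  # testing takes 3x as long as coding -> 3
print(batch_size(10, 5))   # testing takes half as long as coding -> 1
```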

As to the regression test, I am assuming you mean running tests to check for error conditions identified during previous rounds of testing that have since been eliminated.  (I'm not a Q/A guru, so I might be misunderstanding this.)  The thing is, if you introduce many changes, each could be the source of whatever failure is identified.  If you introduce them one at a time, the locus of the problem is much reduced.

Sounds like MB has an excellent strategy for reducing the localization effort, though, even for very large batches.  Thanks for the great tip!

anon
Thursday, July 10, 2003

I agree with the idea of frequent checkins to source control.

I also agree that the relative difficulty between coding and testing influences how big a batch to do at once.

But remember, regression testing can be very time consuming. My estimate of 20 minutes was very very conservative.

CAVEATS

MY goal is to reduce overall effort, not compress the schedule.  I realize that schedule is sometimes the constraint.  I'm not concerned with that here.

Let's define something here.  I'm going to talk about BUILDS.

A build is less than a few hours of work (certainly less than a day, maybe 30 to 150 lines of code) and probably only affects one module. You'd do a unit test for a build and check it into source control.

If you batch your builds, you'll save time on testing.

DAN ASKED:
"If you make a bunch of updates at once and it breaks the build so that your app crashes at startup... where do you start?"

Dan,

If you're a good troubleshooter you can usually quickly narrow down the cause to a module (process of elimination). You could then zero in on the Builds for that module.

WORST case scenario is that you'd start with the first build.  Then you'd be right back where you'd be if you were testing at each build, except that you'd have ALREADY done the test.

BUT, the BEST case scenario is that you do NOT have errors and you saved all that testing. WORST case scenario is that you'd have just as much work as if you tested each build.  The median case is that you'd maybe have a bug but be able to isolate it, and maybe have to look at 10% or 20% of your builds.

Entrepreneur
Thursday, July 10, 2003

If the cost of testing in your particular case is high (for whatever reason), then a "binary search" kind of approach to testing might be the thing.
Batch all your changes: if it gets through the tests, you're home. If not, divide the batch in half and test ...
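
A rough Python sketch of that halving idea. The names find_bad_changes and passes, and the change labels, are made up. It assumes each change can be applied and tested independently of the others, and that a failure is caused by a single change rather than by two changes interacting.

```python
def find_bad_changes(batch, passes):
    """Return the changes in batch that make the tests fail, by halving."""
    if passes(batch):
        return []            # this batch is clean: one test run and done
    if len(batch) == 1:
        return list(batch)   # a single failing change is a culprit
    mid = len(batch) // 2
    return (find_bad_changes(batch[:mid], passes)
            + find_bad_changes(batch[mid:], passes))

batch = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
runs = []

def passes(changes):
    # Stand-in for an expensive test run; pretend change c6 is the broken one.
    runs.append(list(changes))
    return "c6" not in changes

print(find_bad_changes(batch, passes), len(runs))  # prints: ['c6'] 7
```

If the whole batch is clean, the cost is a single test run, which is where the saving comes from. If two changes only fail in combination, both halves can pass and this sketch finds nothing.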

Just me (Sir to you)
Friday, July 11, 2003

If you have a QA department, and the regression test is automated (as it should be), the amount of effort it takes to do a regression should not be of concern until you've reached the end of your development cycle, at which point you shouldn't be making changes anyway. The only changes that should be "batched" are those that are made in a single day, since you should be providing QA with daily builds, which should also be an automated process. QA departments hate getting builds with lots of changes.

MarkF
Friday, July 11, 2003

>>TRUTH: "Software code is READ many more times than it's WRITTEN"

... and this is a truth because all other people usually write many more times than they read it

19th floor
Friday, July 11, 2003
