Silly question about counting lines of code...
Does one typically count whitespace lines and lines of comments?
Not too silly... I was wondering the same thing myself a few days ago.
One way to do it is to count semicolons, if applicable.
As long as the language doesn't support nested comments, it's pretty simple to use regular expressions to strip out comments and whitespace, then count the lines left.
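For what it's worth, here is a rough Python sketch of that regex approach for a C-style language. It assumes // and /* */ comments with no nesting, and the names strip_comments and count_code_lines are made up for this example. It is not a full lexer, so treat the result as an approximation.

```python
import re

# String literals are matched first so that comment markers inside them
# are left alone. Assumes C-style syntax and non-nested block comments.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'"     # single-quoted string or char literal
    r"|//[^\n]*"              # line comment
    r"|/\*.*?\*/",            # block comment (non-greedy, non-nested)
    re.DOTALL,
)


def strip_comments(source: str) -> str:
    """Remove comments but keep string literals and the line structure."""
    def repl(match: re.Match) -> str:
        text = match.group(0)
        if text.startswith("//"):
            return ""
        if text.startswith("/*"):
            # Keep the newlines the comment spanned so code on either
            # side of it is not merged onto one line.
            return "\n" * text.count("\n") or " "
        return text  # a string literal: leave untouched
    return _TOKEN.sub(repl, source)


def count_code_lines(source: str) -> int:
    """Count lines that still have non-whitespace content after stripping."""
    stripped = strip_comments(source)
    return sum(1 for line in stripped.splitlines() if line.strip())
```

You would read a file into a string and pass it to count_code_lines. A line holding only a closing brace still counts here, which may or may not be what you want.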
I also should mention ... if you have a single function call or expression that spans multiple lines but you want to treat it as one line of code, it would be more appropriate to count line-terminating characters (semicolons, end-of-line characters, etc.) after stripping out the whitespace and comments. Naturally, you'd also have to account for the line-continuation characters in languages like VB.
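As a sketch of the line-continuation point, something like the following Python would count a VB statement that spans several physical lines as one. It assumes classic VB conventions (a trailing " _" continues the line, comments start with ' or Rem), and count_vb_logical_lines is a name invented for this example. It is deliberately naive: a trailing comment that happens to end in " _" would fool it.

```python
def count_vb_logical_lines(source: str) -> int:
    """Count VB statements, treating continued lines as a single line.

    Naive sketch: blank lines and whole-line comments are skipped, and a
    physical line ending in " _" is assumed to continue onto the next one.
    """
    count = 0
    continuing = False  # True while inside a continued statement
    for raw in source.splitlines():
        line = raw.strip()
        if not continuing:
            lowered = line.lower()
            is_comment = (
                line.startswith("'")
                or lowered == "rem"
                or lowered.startswith("rem ")
            )
            if not line or is_comment:
                continue
            count += 1  # first physical line of a new statement
        # Decide whether the next physical line belongs to this statement.
        continuing = line.endswith(" _") or line == "_"
    return count
```

The same idea works for semicolon-terminated languages by counting terminators instead of physical lines, though a C-style for loop header would then count as more than one.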
Stripping out comments and whitespace is pretty easy, but is it common practice?
What is the purpose for doing the count? In most cases, it doesn't matter as long as all the code is consistent. For example, if you count the number of lines it takes to implement CRUD logic for a given database table, it should be a reasonable predictor of the number of lines for implementing CRUD logic on similar tables, provided you use consistent coding standards.
This is a pretty well-defined issue in the research. The usual definition of a "physical line of code", which is probably what you're counting, is a line of code ending in newline or end-of-file, and containing at least one statement termination. In other words, it has some non-comment and non-whitespace content.
BTW, to answer the follow-up question, SLOCCount is a very good cross-platform, cross-language tool that does PLOC (physical lines of code) counts as well as COCOMO cost estimations.
> What is the purpose for doing the count?
I created a line-numbering program for Visual Basic out of necessity. It numbers the lines of code so I can use the line number to find errors. At any rate, whitespace, labels, comments, general declarations and various other language entities generate compile-time errors if they are numbered in VB. It's a good way to tell how many actual lines of code I have written. The first version of the program counted every line no matter what. I keep this version around to compare the number of lines in the files to the actual number of "numbered lines". The difference is around half. So if I had a 20,000 line VB program there would be approximately 10,000 extra lines. I don't know if this is universal or just attributable to my coding style.
I should say, if I had 20,000 numbered lines of code there would be 10,000 extra lines, so 30,000 total counting all the lines in the file. So the extra lines come to about half of the numbered lines.
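To spell out that arithmetic, here is a tiny Python sketch. The 0.5 ratio is only the figure reported above for one person's VB code, not a general constant, and estimate_totals is a name made up for this example.

```python
def estimate_totals(numbered_lines: int, extra_ratio: float = 0.5) -> dict:
    """Estimate total file lines from the count of numbered (code) lines.

    extra_ratio is the number of non-numbered lines (blank lines, comments,
    labels, declarations) per numbered line. The default of 0.5 is just the
    ratio one poster observed in their own VB code.
    """
    extra = round(numbered_lines * extra_ratio)
    total = numbered_lines + extra
    code_share = numbered_lines / total if total else 0.0
    return {
        "numbered": numbered_lines,
        "extra": extra,
        "total": total,
        "code_share": code_share,
    }
```

With 20,000 numbered lines this gives 10,000 extra lines and 30,000 total, so roughly two thirds of the file is numbered code.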
The purpose of counting the lines of code? Mostly out of curiosity. Now that the project is nearing completion I found myself wondering how much code I'd actually written.
What programming language did you use?
As some others have said, the answer depends on what you want the count for. You'd want to see separate counts of everything to get the most complete view. In terms of representing code complexity, the popular method seems to be to count things like control statements. I'd guess that typically when someone quotes a line count without further qualification, they mean all lines.
For what it's worth, the Code Complete guy (Steve McConnell) doesn't include whitespace or comments.
Actually, I wrote my program in Whitespace ( http://compsoc.dur.ac.uk/whitespace/ ). Depending on the answer to this burning question, maybe I didn't write any lines of code at ALL! :-(
If anyone is interested in an add-in that counts lines of code in VB and has some other useful features (procedure/error-handling templates, etc.), there's MZ-Tools at http://www.mztools.com/index.htm
These sorts of issues are addressed with function points, which, just for the record, are not functions/procedures, nor are they statements. They are units of functionality.
There's a big difference, I'd think, between saying "I have a 5,000 line Visual Basic application" and "I have a 30 line C# app."
Search Google (or your preferred search engine) for the Software Metrics Association in your country. They normally have a wealth of information on not just line counting, but many other aspects of software measurement.