Fog Creek Software
Discussion Board




When did booleans burn you?

Lighter topic!

I first got burned by booleans on an employment application processing app - several questions had yes/no checkboxes, so I implemented them with booleans.

Made it to beta before someone pointed out that "did not answer" and "unreadable" also needed to be options. So I had to rip out the plumbing and rework it.

I'm now very judicious with booleans (I generally avoid them) and *never* use them for user-interface data.
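
For illustration only, here is a minimal C++ sketch of what the reworked plumbing might look like. The names (Answer, parseAnswer, toLabel) and the one-character data-entry codes are assumptions made up for this example, not anything from the original app:

```cpp
#include <string>

// Hypothetical answer type for a yes/no question on a paper form.
// The last two values are the ones that surfaced during beta.
enum class Answer { Yes, No, DidNotAnswer, Unreadable };

// Maps an assumed one-character data-entry code to an Answer.
// Anything unrecognized becomes Unreadable rather than silently "No".
Answer parseAnswer(char code) {
    switch (code) {
        case 'Y': return Answer::Yes;
        case 'N': return Answer::No;
        case ' ': return Answer::DidNotAnswer;
        default:  return Answer::Unreadable;
    }
}

// Human-readable label for reports.
std::string toLabel(Answer a) {
    switch (a) {
        case Answer::Yes:          return "yes";
        case Answer::No:           return "no";
        case Answer::DidNotAnswer: return "did not answer";
        case Answer::Unreadable:   return "unreadable";
    }
    return "unreadable";  // not reached for valid enum values
}
```

Adding a fifth option later means one new enumerator and one new case per switch, instead of replacing a bool everywhere it is used.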

Philo

Philip Janus
Tuesday, March 04, 2003

Yeah, I always ask myself of the requirement:

Is there any chance the client will request more options?
Is the nature of this question so simple that it can't possibly be asked in any other way?

When I feel even the slightest doubt about it I generally assign a value type that's a reasonable overkill, like

        tinyint in a database server like SQL Server (1 byte; 0 to 255 in SQL Server, plus NULL)
        or byte or short int (8 bits or 16 bits depending on the programming language you are using)

It's not exactly a waste of space, unless your program domain is on the scale of millions of records or more.
If you start having problems you can always encode lots of bit-wise answers into a bitfield.
Waste a little, you deserve it :D
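
A rough C++ sketch of the bitfield fallback, assuming a hypothetical form with a handful of strictly two-state questions (the question names and helper functions are invented for the example):

```cpp
#include <cstdint>

// Hypothetical question indices for one form; the names are made up.
enum Question : unsigned { kSmoker = 0, kVeteran = 1, kHasLicense = 2 };

// Returns `answers` with the bit for question `q` set or cleared.
std::uint32_t setAnswer(std::uint32_t answers, Question q, bool yes) {
    const std::uint32_t mask = std::uint32_t{1} << q;
    return yes ? (answers | mask) : (answers & ~mask);
}

// Reads back the yes/no answer for question `q`.
bool getAnswer(std::uint32_t answers, Question q) {
    return ((answers >> q) & 1u) != 0;
}
```

Note that packing like this only works for answers that really have two states, so it gives back the flexibility the wider type bought you.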

-- David

Li-fan Chen
Tuesday, March 04, 2003

Software engineering texts sometimes mention aliasing the type you are using. Instead of using a type like boolean and values like TRUE and FALSE directly, you go through one level of indirection: a structure or class, or at the very least a simple type that points to some other type.
So your boolean type could point to a bit or a byte or whatever.
And your own versions of TRUE and FALSE could point to true and false, 1 and 0, YES and NO; it doesn't matter.
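
Something like this, as a small C++ sketch (AnswerFlag, kAnswerYes, kAnswerNo and isYes are hypothetical names chosen for the example):

```cpp
#include <cstdint>

// Application-level alias: code uses AnswerFlag, not the underlying type.
// Today it is one byte; it can be widened later without touching callers.
using AnswerFlag = std::uint8_t;

// Application-level constants standing in for TRUE and FALSE.
constexpr AnswerFlag kAnswerNo  = 0;
constexpr AnswerFlag kAnswerYes = 1;

// Callers compare against the named constants only, never raw 0 or 1.
bool isYes(AnswerFlag f) { return f == kAnswerYes; }
```

If a third value shows up, you add another constant and the alias already has room for it.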

Li-fan Chen
Tuesday, March 04, 2003

I would think a UI with two checkboxes "yes" and "no" should have been designed with radio buttons.  It guarantees an answer as one of them is always a default.

If there are other options ("not sure", "n/a") then they should also be displayed as most users get confused when  the UI options don't represent their possible answer.

sedwo
Tuesday, March 04, 2003

Another thing I run into is using arrays and later getting caught having to change the data structure to a hash tree (or worse, a tree).
Again, that's why modular programming really helps and a few helper classes/modules/member functions really smooth things out.

Li-fan Chen
Tuesday, March 04, 2003

sedwo you are one sharp cookie *grin*

Li-fan Chen
Tuesday, March 04, 2003

If you look closely at the Windows API, the "BOOLEAN" type is implemented as a 16-bit integer, and in some cases it actually distinguishes between three states (1, 0, -1), for example, in a tri-state checkbox which may be checked, unchecked, or "indeterminate" (for example, the state of the bold checkbox when some text is bold and some isn't).

Joel Spolsky
Tuesday, March 04, 2003

When they wrote the monstrosity that is our college database system it slipped through the specifications that a student could be marked late, as well as absent or present.

Off they went and made the field boolean. When they found out about the lates, and the fact that they had to make three lates equal one absence, they were not very pleased, and spent a year trying to persuade us we should change the college regulations to fit the table structure.
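
For what it's worth, the rule is easy enough to express once the field has three values. A C++ sketch, with made-up names, and assuming that one or two leftover lates simply don't count (that rounding is a guess, not the actual regulation):

```cpp
#include <vector>

// The three states the register actually needed.
enum class Attendance { Present, Absent, Late };

// Counts absences under the rule that three lates equal one absence.
// Leftover lates (one or two) are assumed not to count.
int effectiveAbsences(const std::vector<Attendance>& marks) {
    int absent = 0;
    int late = 0;
    for (Attendance m : marks) {
        if (m == Attendance::Absent) {
            ++absent;
        } else if (m == Attendance::Late) {
            ++late;
        }
    }
    return absent + late / 3;  // integer division drops leftover lates
}
```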

Stephen Jones
Tuesday, March 04, 2003

Should be "When did the lack of requirements burn you?"

This has happened countless times on the product my company produces.  And it is a nightmare to remedy.  Especially if it has been in the field for a while.

Try migrating boolean to character data, leaving the customer to decide if a false is actually NO or UNKNOWN (Y/N/U)...


The data type doesn't burn you; it's the requirements, or lack of them, that does...

apw
Tuesday, March 04, 2003

apw,

I understand where you're coming from, but it *is* the datatype that burns you. Requirements are always gonna change. It's a given. Accept this nuisance as a fact and life will be better.

Just like Philo, I've been burned on this so about the only place I use booleans is on return values for functions. I never use them on UI elements because eventually, someone is gonna want to add another field and that efficient lil' boolean value will give you a huge case of heartburn.

Go Linux Go!
Tuesday, March 04, 2003

I find myself using maximum-length text fields instead of memo fields, because I know that eventually some manager somewhere will want to search the comments, or even sort by them. The only way to convince them that there's no useful data "hidden" in the comments is to let them look. (The side benefit is that users learn to be concise. At least that's what my inner optimist says.)

I've never been burned by a boolean, but that may be because I've instinctively used them only for 'flag' type values. For example, on a list of employees, "CurrentlyEmployed" would be a boolean; either the guy still works here, or he doesn't.  On the other hand, I wouldn't use a boolean to indicate whether the person is a temp or permanent, because eventually some supervisor somewhere *will* decide that a "contract" employee is different than a "temp" employee. Or something.

Martha
Tuesday, March 04, 2003

All the "problems" with booleans listed so far have been because the boolean was misused.  A boolean is a logical indicator; it is NOT a two-value enumeration.  Whatever data size the boolean implementation has is irrelevant: whether it uses 1 bit or 32, a boolean only has two values.  But that doesn't mean all two-value indicators are booleans.

Avoiding booleans because you misused them is ridiculous.

Vulcannis
Tuesday, March 04, 2003

In my experience, there are only two types of fields in the average database:
1. Fields whose specification is too restrictive. These cause problems for users because they have to decide how to work around the lack of flexibility.

2. Fields whose specification is too loose. These cause problems because trying to extract information out of the database becomes an exercise in second-guessing the creativity of the people doing the data entry.

By definition, a Boolean value can never provide more than the absolute minimum amount of information (one bit). Therefore, ALL boolean values in databases are over-specified, and should probably be replaced with at least a three-value field (YES, NO, not sure).

In the real world of paper forms, a common idiom is the multiple-choice question with catch-all, for example:

How did you find out about Joel on Software?
A. Web search
B. From a friend
C. Slashdot
D. Other: _________________

I don't see this as much as I'd like to in database applications. There is at least one internal database where I work which has a feature whereby user-defined keywords can be attached to any record.

This is potentially a nightmare of one-off values, but in practice, the currently-defined keywords are displayed in a list on the record entry form. It's very easy to pick out the right keyword, and if you really need a new unique value, you can easily create a new one.

This feature ends up being used quite often as a way to try out new organization and reporting techniques without having to redesign the database. If some keywords become part of the process, they can be promoted to "real" fields of their own later.

Mark Bessey
Tuesday, March 04, 2003

Vulcannis,

I think you missed Philo's point. The original requirements were for a field to be "yes/no". That's equivalent to true/false. It's not a two-value enumeration. Using a boolean in that case makes perfect sense. It's not misusing booleans.

The problem happens when someone decides to change the underlying question behind the field.  So rather than asking a "yes/no" question, the question is changed to something where an enumeration is required. Now that boolean field no longer works.

From what Philo said, there was nothing wrong with his design based on the initial requirements. The only problem is that requirements change and using a boolean field to store user input, no matter how proper at first, is destined to fail once the requirements change.

Go Linux Go!
Tuesday, March 04, 2003

I've often been burned by Booleans that were implemented inconsistently in the code.  I'm sure we've all seen code in which 0 equals true for one variable, but 0 equals false for another.

Some of it was my own code....

Brent P. Newhall
Tuesday, March 04, 2003

I am amazed that anyone would ever use booleans for UI elements or database fields.  I don't think I've ever seen any "real-world" type that is strictly boolean.  The only time I ever use booleans is internally within an application because, in my experience, binary computing is the only thing with boolean semantics.

John CJ
Tuesday, March 04, 2003

[I think you missed Philo's point. The original requirements were for a field to be "yes/no". That's equivalent to true false. It's not a two field enumeration. Using a boolean in that case makes perfect sense. It's not misusing booleans]

Actually you can read it as if he had two separate check boxes, one for "yes" and one for "no", so the data type was simply misused. And even if it was one checkbox, a boolean would be the correct data type to store the information.

Data types cannot be wrong, they are simply tools used by the developer. If your basis for a good data type is when requirements change the type can handle it you will be let down. If you can explain how a data type should self-adjust for changing requirements I am all ears.

Ian Stallings
Tuesday, March 04, 2003

Ian,

I'm not sure what you mean by "If your basis for a good data type is when requirements change the type can handle it you will be let down"

It is possible to anticipate requirements evolving and use a datatype that stores a larger set of values. Instead of using a byte field, go ahead and use an integer. Maybe the customer swears their SKU numbers are always numeric, but you figure you'll make the database field a varchar just in case they decide to start adding alphanumerics to their SKUs.  Even if the customer says a field is always yes/no, why not use an integer field instead of a boolean... Yada..Yada..Yada..

I don't think anyone said anything about making datatypes evolve. I believe what has been discussed is "defensive programming" techniques.

Go Linux Go!
Tuesday, March 04, 2003

To answer some of the underlying questions...
1) The user interface was a paper form. I haven't found a good implementation of radio buttons on a paper form yet... (and it was an established paper form - I had no control over its creation)

2) Regarding requirements, it was one of my first applications. As I indicated in the initial post, I *have* learned from the experience. [grin]

Philo

Philip Janus
Tuesday, March 04, 2003

The underlying problem is a mechanism/policy problem. You should use the infrastructure of your application to provide a mechanism, and the upper layers to set policy.

For a database/forms-driven application, the DBMS is the infrastructure and should do no more than provide a mechanism. Take a cue from K&R and provide the most flexible mechanism possible (within reasonable constraints, which will vary from situation to situation). Then implement policy in the upper layers, like the UI.

Thus the original poster's boolean problem could have been avoided by using an enumeration or text string to store the value, and setting the "boolean-like" policy in the UI (with e.g. radio buttons). Then, when you find out you need more than two values, it's a simple matter of changing the policy in the UI. The infrastructure can implement the new policy with no changes.
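
A rough C++ sketch of that split, with invented names: the storage side accepts any string, and the set of values the UI currently offers lives in one policy function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Policy layer: the answers the UI currently offers (hypothetical values).
// Extending this list is a UI change; the stored type stays a string.
const std::vector<std::string>& allowedAnswers() {
    static const std::vector<std::string> kAllowed = {"yes", "no"};
    return kAllowed;
}

// The mechanism can store any string; policy decides what is acceptable.
bool isAllowed(const std::string& value) {
    const auto& allowed = allowedAnswers();
    return std::find(allowed.begin(), allowed.end(), value) != allowed.end();
}
```

When a third answer becomes legal, only the list in allowedAnswers() changes.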

Of course, how it actually shakes out will depend on your situation.

Chris Palmer
Tuesday, March 04, 2003

Booleans are fine as "logical indicators" and as return values for functions, but they are best avoided as function arguments. Boolean arguments are not self-documenting in function calls. For example:

void printEmployees(bool onlyManagers);
...
printUsers(false); // what the hell?

versus:

typedef enum { utAllEmployees, utManagers } UserType;
void printEmployees(UserType type);
...
printUsers(utAllEmployees); // aha!

The problem with boolean arguments is that they seem reasonable when you define functions, because you see the identifiers. They are confusing, though, when you call a function. Moral: Use enumerations instead of booleans for flags and options.

Frederik Slijkerman
Tuesday, March 04, 2003

The printUsers() calls should be printEmployees(), of course.

Frederik Slijkerman
Tuesday, March 04, 2003

Letting your data handle incomplete data is an example of fudging - it can be great if your data is well specified, but I wouldn't want to force specifications onto my users.
Polite software allows fudging: http://archive.devx.com/upload/free/features/getstarted/2000/sp00/acsp00/acsp00.asp
Make your code generic: http://c2.com/cgi/wiki?ZeroOneInfinityRule

Could someone explain why I shouldn't use a boolean to represent the value of a checkbox? It's either ticked or unticked. It would take a scarily flexible solution to cope with someone scribbling over the form and writing "I plead the 5th amendment on this question!" in the margin.

quey
Wednesday, March 05, 2003

quey: The whole point was that sometimes a checkbox becomes inadequate, and if the code underlying it is overspecified (i.e., a boolean), you won't be able to extend the software to meet the new requirements as easily.

Chris Palmer
Wednesday, March 05, 2003

Quey - as an example, your requirement indicates allowing the user to answer "Have you ever used tobacco?"

So you figure "checkbox, right? Yes/no - boolean"

You implement it, build the app, and the day of release someone points out that
a) The policy is for the user to affirmatively answer the question - they must *do* something to select an answer, so there cannot be a "default" answer other than "not answered"
b) Privacy requirements indicate that the user doesn't have to answer the question, but not answering a checkbox indicates the answer "no"

So you have to implement it as a pair of radio buttons without a default selected. You now have three states - Yes, No, and unanswered. And if you've coded your app with a boolean in the database, you have to replace it with an int, rewrite your stored procedures, business logic, etc, etc...

Philo

Philo
Wednesday, March 05, 2003

There's a difference between a printed form, and encoding that, and a form on a screen, the processing of which you can control.

You can't control how someone is going to fill out a printed form, apart from sanctions like 'we will take your first born'.

So, creating the input form from that kind of data always has to cope with that problem.  There's an argument that you can still use a flag for the actual data but that you have a set of aspect fields which show the quality of that data.  That's especially true when you can't expect the person inputting the data to make the quality decision on the data.
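
A sketch of the flag-plus-aspect idea in C++; the struct, the quality categories and the helper are assumptions made up for illustration:

```cpp
// How trustworthy the keyed-in value is; the categories are assumptions.
enum class Quality { Clear, Unanswered, Illegible, Ambiguous };

struct CheckboxField {
    bool value;       // meaningful only when quality == Quality::Clear
    Quality quality;  // "aspect" field describing the data itself
};

// True only when the box was clearly ticked.
bool isAffirmative(const CheckboxField& f) {
    return f.quality == Quality::Clear && f.value;
}
```

The person keying the form in records what they see in the quality field, and whoever consumes the data decides what to do with anything that isn't Clear.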

Simon Lucy
Wednesday, March 05, 2003

While I'm not disagreeing with the principle that incautious use of booleans can come back and bite you, many of the "examples" given here seem flawed to me. 

The argument is that if you store a yes/no question as a boolean, when the spec changes to include unanswered (for example), that data type must change.  Problem is, in many databases, a bit datatype already accepts three values by default - 1, 0, NULL.  Changing the form to meet the new requirement could be as easy as ensuring that the column default is NULL and not setting the radio button (to use the example implementation given in an above thread) to have a default.
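
The same three-state idea exists on the application side. As a hedged C++17 sketch, with std::optional<bool> playing the role of a nullable bit column (TobaccoAnswer and describe are made-up names):

```cpp
#include <optional>
#include <string>

// An empty optional plays the role of NULL in a nullable bit column.
using TobaccoAnswer = std::optional<bool>;

// Renders the three possible states for display.
std::string describe(const TobaccoAnswer& a) {
    if (!a.has_value()) {
        return "unanswered";
    }
    return *a ? "yes" : "no";
}
```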

The example given near the top of this thread with four options was much better.  However, it seems to me that the responses all focus on how "booleans burn you" because the data type is limited to two or three responses.  In my experience (maybe what I do is unusual, but...), changing the datatype in the database is much easier than changing the user interface.

If the requirements change such that a yes/no question becomes a yes/no/maybe/I don't know/Unanswered question, if the extra options weren't anticipated, taking a minute to change the database from a bit to something that can hold a limitless number of options isn't going to be the part that burns.

My 2c

Phibian
Wednesday, March 05, 2003

Two of the things I like most about C# are t/f is NOT the same as 1/0, and Enums. That really takes care of a lot of these problems.

But getting burnt by changing specs doesn't stop at bool - say you decided to answer 'do you smoke' with an int instead of a bool - then you can be sure the requirement will become a user comment and you will still be mucking about.

Robin Debreuil
Thursday, March 06, 2003

The problem seems to be the use of booleans to represent values that are not "true/false", but instead just accidentally happen to have two possible answers.

In that case, the right approach is to use an enum instead of a bool. That way, if your solution space suddenly grows an extra option or two, you just expand the enum.

Of course, this isn't so easy in a DB - unless SQL supports enumerated types (I don't know a whole lot about SQL).

Chris Tavares
Thursday, March 06, 2003
