Fog Creek Software
Discussion Board




Welcome! and rules

Joel on Software

Best OO data practices for data validation

Say I have a class whose New constructor is overloaded a few times

Pseudo vb.net code:

New ()
New (value1 as string)
New (value1 as string, value2 as string)
New (Value1 as string, value2 as string, value3 as Integer)

Being an academic (one who reads lot’s of programming books, but doesn’t get his hands dirty too often) I want to use the best OO practices so I encapsulate

Private mValue1 as string
Private mValue2 as string
Private mValue3 as Integer

Public Property Value1 as String
Public Property Value2 as string
Public Property Value3 as Integer

Now being a defensive guy when it comes to programming (I hate doing tech support on systems I can’t access) I want to make sure that the values being passed in are good (let's say none of the strings can contain a “g” in them).

Problem: It seems to me that I have to write this same validation routine in many places – within each of the constructors and also in the set for the property.

What I see as the solution – write a validation routine for each of the member fields - e.g.

Private function ValidatemValue1 (value1 as string)
        If value1 does not contain “g” then
Return value1
Else
        Throw error (“Value1 must not contain g”)
End If
End Function

Then…. I would do the following for any assignment of value1 to mValue1

mValue1 = ValidatemValue1 (Value1)

Needless to say, I have never seen this in any of the myriad programming books I have read. It seems to me that they either:

a) don’t care about this & I am being too defensive

b) write the entire validation routine wherever they are setting the member field value – which seems to me to be very error prone and tedious

An I being a dork – am I worrying about nothing – is this something that I should just know?

I would greatly value your thoughts

i like vb.net
Friday, September 20, 2002

First off, buy the book "Design Patterns: Elements of Reusable Object-Oriented Software".  Read it, apply it, read it again, apply the patterns in different situations.

I think that you can solve your validation problem by having the member variables validate themselves.  Then in your validation function, you could just pass the member objects to the validation function and the validation function would call validate on the member object (this is just off the top of my head, I have not coded and tested this):

class CValidate
{
  virtual int Validate();
}

class CString : CValidate
{
  string m_sString;
  int Validate()
  { if (m_sString)
        return 1;
    else
        return 0;
  }
}

class CLong : CValidate
{
  long m_lNum;
  int Validate()
  {
    if (m_lNum > 0)
        return 1;
    else
        return 0;
  }
}

void Main()
{
  int iRet;
  sString = new CString();
  sString.m_sString = "Hello";

  iRet = sString.Validate();
}

Just my 2 cents

Tim
Friday, September 20, 2002

Tim,

Thanks for taking the time. I share your enthusiasm for patterns, yet my {} skills are non-existant so your example went right over my head.

Were you applying a specific pattern, if so which one?

Also, my original question was intended to get a sense from the esteemed members of this community as to the necessity of being so vigilant about data validation.

I could still use a hand here...

i like vb.net
Friday, September 20, 2002

The brackets just denote what scope the code is in:

if (m_sString)
{
  // do something here
}

is the same as

If m_sString Is Not Nothing Then
' Do something here
End If

brackets are just a way to make the compiler and coder know what scope the code is in.

Grab "Programming C#", by Liberty.  It is a good introduction to C# programming.  Also, you should pick up a good introduction to OO programming.

The point that I was trying to make in the example beforehand is to create a parent class that has a virtual function called Validate() then each class that you want to provide validation code for, you could then derive that class from the parent class and override the Validate() function to provide validation code specific to that class (In my example, both CString and CLong derived from CValidate, and they both implemented their own validation code).  If you want yet, more customization (e.g. CCapsString to represent strings that should be all uppercase), you can make the CString.Validate() function virtual, and derive CCapsString from CString and implement the CCapsString.Validate() function to validate whether the string contains all uppercase letters.  This is probably going over your head, too.  You need to do some reading and walk through some code to get a feel for all of this.

Tim
Friday, September 20, 2002

To make validation, I don't see why you would need complex classes and inheriting. All you need is a function like this. It's similar to the code you wrote, but I simply return True/False. It's better like this because 1) You don't copy the string data, so the operation is lighter 2) The code is simpler. You can have a set of such functions for each different validation type. (eg: one for Tel validation, one for Zip Code validationk one for email validation, etc.). You can also put those functions in a class (shared members), wich will be well encapsulated.

Private Function ValidatemValue(Value As String) As Boolean
    If Value = "g" Then
        Return True
    Else
        Return False
    End If
End Function

This is the way I see it

Etienne Charland
Friday, September 20, 2002

Just put the validation in the Property Set procedures:

Imports System.Text

Public Class Class1
    ' Constructors
    Public Sub New()

    End Sub

    Public Sub New(ByVal Value1 As String)
        Me.Value1 = Value1
    End Sub

    Public Sub New(ByVal Value1 As String, _
    ByVal Value2 As String)
        Me.Value1 = Value1
        Me.Value2 = Value2
    End Sub

    Public Sub New(ByVal Value1 As String, _
    ByVal Value2 As String, _
    ByVal value3 As Integer)
        Me.Value1 = Value1
        Me.Value2 = Value2
        Me.Value3 = value3
    End Sub

    ' Private storage
    Private mValue1 As String
    Private mValue2 As String
    Private mValue3 As Integer

    ' Public properties

    Public Property Value1() As String
        Get
            Value1 = mValue1
        End Get
        Set(ByVal Value As String)
            If Value.IndexOf("g") > 0 Then
                Dim e As Exception = _
                New Exception("Value1 must not contain g")
                Throw e
            Else
                mValue1 = Value
            End If
        End Set
    End Property

    Public Property Value2() As String
        Get
            Value2 = mValue2
        End Get
        Set(ByVal Value As String)
            If Value.IndexOf("g") > 0 Then
                Dim e As Exception = _
                New Exception("Value2 must not contain g")
                Throw e
            Else
                mValue2 = Value
            End If
        End Set
    End Property

    Public Property Value3() As Integer
        Get
            Value3 = mValue3
        End Get
        Set(ByVal Value As Integer)
            mValue3 = Value3
        End Set
    End Property

End Class

Mike Gunderloy
Saturday, September 21, 2002

Ettienne & Mike - thank you for providing someadditional suggestions. This discussion is really helping me a great deal as I get to see different approaches to what I feel is a common problem. I have some comments for each of you below, I look forward to hearing more from you or anyone else who would like to chime in -

Thanks again

Ettienne - thanks for the input - I agree with you on all points except using a boolean. While it may be more high-performance, I think it becomes a little more cumbersome to use.

I see your approach working as follows anytime I need to assign a value to a member field such as mValue1

if ValidatemValue(Value) = true then
mValue1 = Vales
Else
Dim e As Exception = _
New Exception("Value1 must not contain g")
Throw e
End if

This is far more typing than my original proposal which would look like

mValue1 = ValidatemValue1 (Value1)

But there may be some syntax I am not aware of that would allow for the use of your approach in manner similar to mine (that os - one simple line of code to make the assignment with all of the error handlind hidden away)


Mike - Thanks for fleshing out the throwing of the exceptions and adding in the Try/Catch blocks. Your code was is very nicely presented and easy to read.

Your approach is exactly how I started out  - I then started to think about a few other scenarios such as:

Public ReadOnly Property Value1 as String
- here there is no set routine

or

Private mValue4  as String
- here is a member field that has no property related to it at all. Do you suggest that I create a private property for this variable?

In short, are you saying that I should never actually explicitly interact with the member fields?

Thanks again to everyone - still looking for any other opinions and also feedback on how defensive you are when you code.

i like vb.net
Saturday, September 21, 2002

Hey I could have sworn there were try catch blocks in Mike's example - I will be implememnting try/catch in my final version.

While I am at it, the title of this thread was supposed to be
"Best OO practices for data validation" not "Best OO data practices for data validation"

Still looking for more suggestions.... thanks

i like vb.net
Sunday, September 22, 2002

Nope, I didn't bother with try/catch blocks...just bubbled up the validation errors to the caller. You might like to do something different, of course. And most likely you'll want to declare a custom Exception class rather than throwing the generic System.Exception.

As to your cases that break this pattern:

- In my own work, at least, this pattern covers most cases. So I use it when things are not at the borders. It gets easy to set up after the first few times (and of course there are plenty of code-templating tools that you can use to make it even easier).

- For the ReadOnly property, of course you don't need to worry about validation on the data after creation. So then it boils down to how to handle validation on the same parameter in multiple constructors. The same is true about the purely private variable, accessible only through the constructors. I think I probably would define a private property procedure for both of these cases, rather than using fields and manipulating them directly -- even if I did not have to do any validation in the first place. The Property procedures just give so much flexibility in the case of future requirements changes that I think they're worth having in there. Sure, there's some overhead, but I've not found it to be significant.

Mike Gunderloy
Sunday, September 22, 2002

I personally would avoid rejecting user input (and avoid error message boxes) as much as possible. The user is always right, even if he's wrong! Often, user data isn't wrong, just incomplete, ie. part of a social security number, (even if it is wrong according to strict validation rules).

It all depends on what kind of data you try to validate and what sort of application you are writing, and the interface of course.

One method is to make it impossible to enter faulty data - make a textbox that won't accept the letters that are forbidden.

When that is not possible or good enough, what about making IsCorrect-properties for each property that needs custom validation - Property string MyProperty1 would have a boolean IsCorrectMyProperty1 that would return true or false.

The data validation could then be called where ever and whenever you want to check if the data is correct or not, and you can indicate errors or incompleteness in the user interface.

Johan
Sunday, September 22, 2002

In the suggestion I made...

Private Function ValidateValue(Value As String) As Boolean
If Value = "g" Then
Return True
Else
Return False
End If
End Function

you wonder why I return a boolean instead of a string? By returning a boolean, you can use this procedure multiple ways. All what it does is tell if it's valid or not. Then, the caller do what he wants. You can copy data if it's valid, set the value to the default value is it's invalid, display a message box if it's invalid, etc.

Here is another way you can do interface validation. You can create a class that inherits from TextBox that has a property called "Validation". You could put a validation mecanism for strings, numbers, currency, float values, email addresses or anything you want. Then, all you have to do to validate your data on a form is to put your control on the form and set it's validation property! Nice, no?

Personally, I don't use this technique for TextBoxes, but I do have a class that inherits from the ComboBox that makes sure the text entered is in the list (and it also has auto-complete). For validation in TextBoxes, I call a method in the form load event. When called, the method attaches itself to the validating event of the control. (it also formats the text with a specified format)

In fact, validation is used in so many cases, and each of them can be implemented in so many ways...

Etienne Charland
Sunday, September 22, 2002

Thanks to everyone for their contributions -  I really appreciate the time and effort. Time for me to stop talking and start implementing.

i like vb.net
Monday, September 23, 2002

Just wondering about the performance implications of throwing custom errors. From a code readability / maintenance standpoint it is obviously the best approach, but if performance were a concern would you be better to simply return a boolean value indicating whether the validation was successful or not? My understanding is that throwing errors (and the associated bubbling of events) can actually result in poor performing applictions.

Any feedback?

Thanks

Jason

Jason
Tuesday, September 24, 2002

I believe that throwing exceptions costs more in processor cycles, but so what?  If you are validating user input, performance isn't a major concern as a user can't tell ten clock cycles from 100 cycles on modern computers.  It is a different story if your are validating 1000's of records via an import interface.

Bored Coder
Thursday, September 26, 2002

One quick thought...if you're developing a web application you can do all the client-side validation you want.  But for God's sake PLEASE validate on the server as well.  It is trivially easy for an attacker to bypass client-side checks. This is one reason why attacks like sql injection, cookie poisoning, and cross-site scripting exist.

Jeff Williams
Saturday, September 28, 2002

Jeff,

I hear you load and clear - When I use other people's components, I don't like suprises, so why should I unleash suprises on someone else ;)

The code I am writing here is really just a wrapper for a very tempermental EXE. The 1st pass at this took about 30 minutes, However, one false move would hang the process. So, having no way of knowing what the next developer will do with it, I buckled down and started building this out as nicely as I could. 

I have taken great care with my component to validate all data (i.e stuff that is passed in as well as  stuff that is the result of - say - concatenating two different inputs. I have both Winform and ASP.NET based test implementations.

When validation doesn't work out - it sends out an informative error message - including the steps taken to leave the object in a valid state to the calling routine. Finally, if it is switched on in the config file, write out all errors to a log file.

So far so good.

i like vb.net
Sunday, September 29, 2002

*  Recent Topics

*  Fog Creek Home