Fog Creek Software
Discussion Board




Why Do We Need Generics?

Okay, I'm stupid...  But, in just a few words, why do we need generics?  Can't seem to find a succinct answer to this question anywhere.

anon
Monday, May 24, 2004

Because brand name prescription medication is just too expensive.

Alyosha`
Monday, May 24, 2004

I'm afraid the only succinct answers you'll get are "We don't", and "It makes stuff clearer".  Like just about any other mechanism for code clarity (*), it's not required, but it can be very useful.

(*)  Note: since you used the term "generics", and since this was immediately after a java-centric thread, I'm assuming you're talking about the java generics-style construct, where the generics don't give anything like specialization (where a List of bits can be implemented differently from a List of Invoices).  This is why the only thing it really adds is code clarity.

Generally, it takes a class of defects, where a programmer takes a List of Foos and mistakenly treats it as a List of Bars, and moves the detection of this error from runtime to compile time.  (Or, with a good editor, to type-time.)  It's basically just a documentation construct that the compiler (and editor?) understand and enforce.

When people talk about Lists of Foos and Lists of Bars, it doesn't seem entirely useful, but when you get into more-complex object hierarchies and transformations, built mostly with reusable components, they're very useful.  When used well, they have the ability to give you most of the benefits of code reuse (reusing general components and containers) with most of the benefits of non-reuse (components and containers specific to your objects).

schmoe
Monday, May 24, 2004

The question really isn't "Why do I need them?", since you can get by without them.

The better question is "Why should I want them?"

You can find the answer to that in lots of spots all over the web.  One is here:
http://www.eu.microsoft.com/Belux/fr/msdn/community/columns/wuyttersprot/generics.mspx

Herbert Sitz
Monday, May 24, 2004

Without generics you may have weak typing and therefore have run-time errors instead of compile-time errors:

List cats = new List(); //list of cats
Cat cat1 = new Cat("fluffy");
Cat cat2 = new Cat("tabby");
Dog dog = new Dog("fido");
//add cats to list
cats.Add(cat1);
cats.Add(cat2);
cats.Add(dog); //oops: no compile-time error
//process all the cats in the list
foreach (object o in cats)
{
  //item in list is a cat
  Cat cat = (Cat) o; //oops: run-time exception when Dog item is dequeued
  cat.Miaow();
}

To get strong typing here without generics, I need to define a ListOfCats class:

class ListOfCats {
  void Add(Cat cat); //compile-time error if someone attempts to add a Dog
  //+ other methods e.g. to iterate ...
}

Doing that means I should define another class:

class ListOfDogs {
  void Add(Dog dog);
  //+ other methods ...
}

Note that I'm copying-and-pasting ListOfCats, to write ListOfDogs with a different parameter data type.

Generics means that I can write a single class, like TypedList, and instantiate it for a specific data type, e.g.:

TypedList<Cat> cats = new TypedList<Cat>(); //list of cats
...
cats.Add(cat2); //ok
cats.Add(dog); //compile-time error! better than a run-time error ...

Generics are also useful, in languages where all data types don't derive from a common base Object class.

Christopher Wells
Monday, May 24, 2004

We don't *need* them, but using them can make our code simpler (since we won't need to roll our own strongly-typed collection classes) and safer (since we won't be as tempted into just using generic ArrayLists, with the lack of type safety.)  There's a good discussion of this in the new C# 2.0 Specification:

http://msdn.microsoft.com/vcsharp/team/language/default.aspx

Also, FWIW, C# 2.0 will also support nullable types (like a nullable DateTime type.)  The present lack of support for nullable types was a point of contention here a few months back.

Robert Jacobson
Monday, May 24, 2004

Of course cats and dogs work well until you find a minah bird or a parrot that can miaow. Do you really want a list of cats, or things that can miaow? The correct answer is, of course, that you should re-implement the whole thing with a Python.

David Roper
Monday, May 24, 2004

This is a really difficult question.  I personally think templates are a mess in C++ and are causing the demise of the language.  In C++ there is huge feature that has been missing since the advent of templates, namely concept checking. 

Concept checking is a way to make sure that a type passed as a template parameter meets the requirements needed to use it.  Without this we end up with horrendous error messages in the template instantiation code.  It forces the user to go looking through the implementation.  This is the worst kind of data hiding possible, and is exactly the opposite of the desired effect.  I feel the C++ community has decided to ignore these serious limitations, because Templates are cool. 

In my opinion behavioral polymorphism based on templates (ie Policies) is terrible in practice.  In fact the same thing can be achieved with the Strategy pattern much more cleanly. 

I have an article in the queue as CUJ on this topic.  Hopefully it will make to publication as it is controversial, considering the boost guys seem to have all the thunder these days. 

christopher baus (www.baus.net)
Monday, May 24, 2004

schmoe-  Very well put.

Caffeinated
Monday, May 24, 2004

Christopher, "concept checking" as you call it would be something the compilers implement to give better diagnostic error messages during compilation, right?  Not something that would be in the C++ language itself?

(Just making sure I understand the concept, it is an interesting one...)

Michael Kale
Monday, May 24, 2004

But...But...to play devil's advocate, isn't that what interfaces and polymorphic behaviour is for?  Instead of Dogs and Cats, they can all implement Animal, and when you call getType() or whatnot, it would call the actual underlying method of its actual type.  It seems to me like this is a perfectly acceptable solution.

vince
Monday, May 24, 2004

Take the following simple class:

template <class T>
class foo{

public:
  void doit(){
    myT.bar();
  }

private:
  T myT;
};

In the example T has the implicit concept that requires function bar();

The problem is the error occurs in doIt() when the template is instantiated, not when class passed as T is defined. 

This exposes the implementation details of foo to the user, and while this example it doesn't seem like a significant problem, it get to be a serious problem when the constructs get more complicated (ie aggregating template classes).  I have wasted hours of valuable development time wading through errors caused by this problem.  This is the only time in my career where I spent more time fixing compile time errors, than run time. 

In my opinion the lack of concept checking is reason enough to avoid compile time behavioral polymorphism in large production C++ systems.  It certainly almost always overrides the said performance advantages. 

If generics are added to Java or C# then I think it would best for the language designers to learn from this mistake in C++.  Since C++ must now be backward compatable with existing code, it will likely never be fixed. 

Many influential members of the C++ community have decided to ignore the problems with templates and push ahead with their pervasive deployment anyway.  They then go on to imply that those of us who question their applicability, are not up with the latest C++ techniques.  I think it is time that some to show how these techniques are failing in production, and that this is polarizing the community as a whole.   

christopher baus (www.baus.net)
Monday, May 24, 2004

Vince, polymorphic behaviour is to let you call animal.move() without knowing what type of animal it is; doing ((Cat) animal).miaow() is an "upcast" and is frowned upon.

Also, I shouldn't need to do "if "animal.GetType() == typeof(Cat)) do_what_i_wanted() else who_knows_what_to_do_now()" if I *know* that the animal in question is a cat (which I do know if the collection is of type list<Cat>, because I know that this list's Add method only accepts objects of type Cat).

Furthermore, in C++ I can have list<char> or list<double> ... fundamental "value" types which don't derive from any common base class (and, on instances of which you can't do run-time type-checking).

> In C++ there is huge feature that has been missing since the advent of templates, namely concept checking. 

Isn't this just a compiler implementation detail, not a language problem? You simply want the compiler error to reference the line of code that is causing the template instantiation[s].

Christopher Wells
Monday, May 24, 2004

> Isn't this just a compiler implementation detail, not a language problem? You simply want the compiler error to reference the line of code that is causing the template instantiation[s].

No it is not.

christopher baus (www.baus.net)
Monday, May 24, 2004

The problem is really one of communication.  Using abstract base classes, the types needed are explict and required by the language.

Using templates they are implicit, required by the implementation of a templatized class.  Changing a concept often happens with out the implementor of a templatized class realizing it.  This is bad. 

The implementor should have to specify the concept, and when he or she breaks the concept he should be notified right then, not when a user passes a type via a template parameter to the class.

Some consider lazy template instanciation to be a feature in C++, I consider it a serious liability. 

christopher baus (www.baus.net)
Monday, May 24, 2004

> The implementor should have to specify the concept, and when he or she breaks the concept he should be notified right then, not when a user passes a type via a template parameter to the class.

Given your example foo<T>::doit(), is it the implementor of the template or the implementor of the "T" type who should specify the concept? And, what syntax might you use for example, to specify the concept: apart from declaring bar() in T, and using bar() in foo?

Christopher Wells
Monday, May 24, 2004

The implementor of foo should provide the concept.

christopher baus (www.baus.net)
Monday, May 24, 2004

I don't know what extra syntax you'd suggest, to specify the concept; if by 'concept' you mean 'signature that this template requires of T', well if T is for example a collection, e.g. a list<>, then it has dozens of associated concepts.

> I have wasted hours of valuable development time wading through errors caused by this problem.  This is the only time in my career where I spent more time fixing compile time errors, than run time. In my opinion the lack of concept checking is reason enough to avoid compile time behavioral polymorphism in large production C++ systems.  It certainly almost always overrides the said performance advantages.

Well, Stroustrup is clear about the design intent, which is that using templates should incur no *run-time performance* penalty at all; which rules out doing what C# does, e.g. making collection derive from IEnumerable and making their methods virtual functions.

Christopher Wells
Monday, May 24, 2004

When I said performance advantages I meant policy vs strategy.  Policy is said to be faster than strategy because it doesn't use virtual functions.  In my experience the advantage almost never matters.

If list has many associated requirements, so be it.  I still think the concept should be fully specified. In fact, that's exactly the problem.  As a user of list there isn't a clear place to go to find the requirements of T.  This is a significant oversight in template language in my opinion. 

I think it would be a shame if generics were added to either Java or C# without learning this lesson and making concept specification and checking a requirement on all parameterized types.

christopher baus (www.baus.net)
Tuesday, May 25, 2004

High Five, Alyosha! That was great!

Dennis Atkins
Tuesday, May 25, 2004

Is there some reason why in the above code you wouldn't just do this:

foreach (Cat c in cats)
{
  Cat cat = (Cat) c; //no run-time exception when Dog item is dequeued because dog is not a cat and s hould not be in the cats to begin with and why so much casting casting casting everywhere
  cat.Miaow();
}

Wondering
Tuesday, May 25, 2004

>If list has many associated requirements, so be it.  I still think the concept should be fully specified. In fact, that's exactly the problem.  As a user of list there isn't a clear place to go to find the requirements of T.  This is a significant oversight in template language in my opinion.

I understand what you're saying about having the compiler do concept checking, but I must say the lack of it doesn't have to make life as hard as you suggest.  If you're using the STL, there /is/ a clear place to go to find the requirements of a T - it's "Generic Programming and the STL" by Matthew H. Austern.  It explicitly states the requirements of T for all the standard library containers and algorithms.

If you're writing a template class or function yourself, just include a comment up the top that states T's requirements, and make sure you keep it up to date ;-)

Tim Serong
Tuesday, May 25, 2004

That aside - after five minutes of reflection - I can't see why you *couldn't* add concept checking to the language.  It's not going to be as simple as I'm about to make out, but how about something like this:

// classes that have this signature
// are models of "Assignable"
concept Assignable
{
  Assignable(const Assignable &);
  Assignable & operator=(const Assignable &);
};

// similarly...
concept LessThanComparable
{
  bool operator<(const LessThanComparable &);
};

// similarly...
concept EqualityComparable
{
  bool operator==(const LessThanComparable &);
};

// std::list requires that T is Assignable
template<class T, class A = allocator<T> >
concept<T>::Assignable
class list
{
public:
  // in addition, sort requires LessThanComparable...
  concept<T>::LessThanComparable
    void sort();

  // ...and remove requires EqualityComparable.
  concept<T>::EqualityComparable
    void remove(const T & val);
};

It's backwards-compatible with existing code by simply omitting the 'concept' keyword from your class or function definition.

Tim Serong
Tuesday, May 25, 2004

As you can seee, this argument has degenerated into Static/Dynamic typing, which is the real debate.  The real question is "Why Do We Need Static Typing."

Generics are needed to make static typing work.  If you have static typing, then you really need generics.

At the moment all of the Java containers require casts to add and remove elements.  When you add, you cast up to Object, and when you remove you cast down to whatever object you were expecting.

As soon as you use a cast, you have abandoned Static Typing.  When you say myCat = (Cat) e.nextElement() you are telling the compiler that the next element is a Cat.  It cannot tell what the next element is at compile time.  As has been pointed out above, this leads to runtime errors rather than compile time errors.

With generics you can explicitly say which types you would like the container to hold when you declare it.  The need for casts is removed and the compiler can statically test the activities using the container.

If you cannot see the use of this, then you really should switch to Python or Ruby.  It is a wonderfully liberating thing to do.

Ged Byrne
Tuesday, May 25, 2004

> If you're writing a template class or function yourself, just include a comment up the top that states T's requirements, and make sure you keep it up to date ;-)

Keeping the comments up to date almost NEVER works.  The main reason is the implementor of foo will not realize what concepts are required.  For instance I require a public default constructor in foo, and I purposely left that out wondering if anybody would notice.  If T does not have a public default contructor the user will get an error message
where myT is declared.  Unless the implementor of foo tries to instanciate it with a class that doesn't have a public default constructor, he or she may never realize that I have this requirement.  This is the fundamental problem.

Templates were invented to solve the problem that everyone returns to: making generic containers.  This is only a small portion of how templates are being used or abused in C++ today. 

Alexandrescu has done a lot of work in Policy based design, but in my opinion Policies are a solution looking for a problem.  I have seen some hideously difficult to maintain code that has made heavy use of templates.  In fact I might guilty of writing some, but I think I've learned my lesson. 

In the end the biggest problem facing software developers is not performance as proponents of Policy based design claim, but correctness and maintainability.  In my opinion templates are sufficiently messed up in C++ that you sacrifice both when using generic programming idioms.

christopher baus (www.baus.net)
Tuesday, May 25, 2004

Christopher Baus,

I still don't see how this isn't a compiler feature.  A good one surely, but I don't see it as a language feature.

In your example above, when the compiler is compiling foo, it makes a note that the type T needs to have a .bar() method.  Then, whenever anyone wants to instantiate foo, the compiler would check that the template parameter has a .bar() method and emit a useful error if not.

Since templates are required to be source code (ie, not compiled ahead of time), you know that your compiler has seen both the template code and the instantiator code, so it has all the information it needs to emit this error. 

But I don't see how we need anything else in the language to support it.  Unless you're talking about standardization of error messages across compilers?  That, however, is something I would not like to see...

Michael Kale
Tuesday, May 25, 2004

On concept checking - Java and C# generics have this notion. You can declare a List<T extends Bar>, meaning anything you put in the list needs to be a subclass of Bar. Because generics are so limited in Java and C#, there's little other specification needed (ok, so there's also <T super Bar>, but I'm not going to go into that).

Java uses type erasure, so for a pre-1.5 VM, the code appears to operate only on the Object class, with occasional casts inserted at the right places. The only difference in C# is that the compiler does code specialization for primitive types for performance improvements.

In C++, you can do a lot more intersting things with generics. What Christopher is referreing to is that there isn't a lot of compile time checking. For instance, the implementation of Foo<T> can call T::bar(); but there's no way to express this constraint. I recall seening a proposal to change the C++ standard to allow for constructs that describe constriants on template arguments.

igor
Tuesday, May 25, 2004

To answer the question with out all the language specfic bickering going on here,  generics allow code reuse.  Say you write a sort without generics you have to write one for every type that you want to sort.  With generics any type can be sorted as long as that type implements the ablilty to be compared(and compared in the way that you use in the sort).

Brian
Tuesday, May 25, 2004

> To answer the question with out all the language specfic bickering going on here...

Ok I'll appologize for the language specific bickering, but since C++ currently has the most widely used support of generics I think it is worth while to talk about generics from the perspective of C++.  If Java and/or C# get generics right, as much as I hate to say this, I think this will be another nail in the  C++ coffin.

christopher baus (www.baus.net)
Tuesday, May 25, 2004

"I think it would be a shame if generics were added to either Java or C# without learning this lesson and making concept specification and checking a requirement on all parameterized types. "

From what I've read about C# generics, the definition of a generic class requires specifications for what types of parameters it can take.  I don't remember exactly what kinds of restrictions you can make, but things like "classes implementing X interface" or "subclasses of X" are certainly among them.  I don't believe the C++ style restriction of "has some method named doit" is allowed, but I could be mistaken.

MikeMcNertney
Tuesday, May 25, 2004

"But...But...to play devil's advocate, isn't that what interfaces and polymorphic behaviour is for?  Instead of Dogs and Cats, they can all implement Animal, and when you call getType() or whatnot, it would call the actual underlying method of its actual type.  It seems to me like this is a perfectly acceptable solution. "

But what if you actually want a list of cats, and you know it is supposed to be a list of cats?  Allowing you to put Dogs into what you believe is a list of Cats is an error waiting to happen.

What if you want to do something with your list of Cats which isn't appropriate for other Animals?  You shouldn't put a method "playWithCatnip" in Animal because it isn't appropriate for Dogs or other subclasses.

And what if you want to have lists of two completely unrelated types?  Should you have to implement a separate type-safe list class for each one?

The point is that as others have said, when you have a static typed language, missing generics hurts your ability to do static type analysis.  Generics provide a way to keep static type analysis while also saving us from having to re-write collections (or other classes) for every type they could be used on.

MikeMcNertney
Tuesday, May 25, 2004

Ok, before we can have this debate, we need to clarify what you mean by "generics".  Do you mean .NET generics?  Java generics?  C++ templates?  Or generic programming in the abstract?  Because they are all very different, and it's hard to discuss the merits (and/or demerits) of "generics" without defining that term.

There are a great many differences between the C++ concept of templates and the .NET (2.0) concept of generics.    The Java concept of generics is again different, though I don't know much about them to properly speak on them.

See:
http://weblogs.asp.net/branbray/archive/2003/11/19/51023.aspx

For a good (though lengthy) comparison of the .NET generics and c++ templates. 

They are both powerful tools, but (IMHO of course!), c++ templates are much more useful than .NET generics.  One of the coolest things I'm looking forward to in Whidbey is the c++/CLI binding which will allow the use of real C++ *templates* with .NET classes.

Michael Kale
Tuesday, May 25, 2004

>See:
http://weblogs.asp.net/branbray/archive/2003/11/19/51023.aspx

This is really interesting article.  What I call lack of concept checking he calls lazy structural contraints. 

here's the relavant part from his article:


Templates use “lazy structural constraints”. What happens when a template relies on a function and it’s not there? When writing the template, the developer can call member functions on the template parameter, use operators, or call functions that use the template parameter. The definition of the template is kept by the compiler until later needed for a specialization. When the compiler creates a specialization, if a function call or an operator has no meaning for a particular parameter, then a compile-time error occurs. In short, the constraint for a template parameter is that it supports a particular operation (i.e. it has a function with a suitable overload or it has a valid operator). Template constraints are enforced at specialization rather than at definition.


I use the phrase “lazy structural constraint” with much liberty. The notion I am trying to get across is that the constraint is enforced lazily because compiler diagnostics appear at specialization. They are structural constraints because they require some kind of support from a type parameter that is not necessarily dependent on a subtype relationship.

christopher baus (www.baus.net)
Tuesday, May 25, 2004

Right.  I (think) I understand what you're talking about, but I don't see why this isn't something that can be implemented in the compiler without changes to the language.  When you're compiling:

template<typename T>
class foo
{
  T myT;
  void doStuff() {T.stuff();}
};

class bar {};
class baz { void stuff(); };

Then when you try to instantiate foo<bar>, the compiler knows that bar does not have a stuff() method, so it can emit a useful warning "bad instantiation of foo: typename bar does not define stuff()" or whatever.  Similarly, when you instantiate foo<baz>, the compiler knows that baz does define a stuff(), so it goes on ahead.

But no changes to the langage are necessary for this.  A compiler writer could do this today instead of something silly like "could not find bar::stuff()" or whatever silly message it spits out now.

What changes to the language specification would make this easier?  Or am I still misunderstanding what you're saying?

Michael Kale
Tuesday, May 25, 2004

make that myT.stuff() instead of T.stuff().  And I suppose if we're being pedantic, I need public: and some other stuff in there too .... sheesh, writing correct code the first time on a whiteboard, err weblog, is hard!

Michael Kale
Tuesday, May 25, 2004

One problem is the template is not instantiated until foo::doStuff() is called.

christopher baus (www.baus.net)
Tuesday, May 25, 2004

Oh, that's a good point.  If the calling code instantiates a template but doesn't fully utilize the methods of a class, then they're ok as long as the template parameter supports all the methods actually used. 

template <typename T>
class foo
{
  T myT;
public:
  foo(T t) : myT(t) {}
  void doStuff() { myT.stuff(); }
  void doOtherStuff() { myT.otherStuff(); }
};

struct bar
{
  void stuff();
};

bar b;
foo<bar> f(b);
f.doStuff();

This is ok.


When they make a seemingly innocent change to call a method that they hadn't called before it can blow up on them:

f.doOtherStuff();

Boom!


This is an interesting problem.  I would agree (tentatively at least, I haven't thought that much about it) that it would be good for compilers to attempt to compile all code in the template when foo<bar> is instantiated.

That said, I don't like the "where clause" of .NET generics.  I don't think that template parameters should be constrained to those that implement a particular interface.  I like the ad-hoc interface nature of c++ templates ... anything that implements a stuff() and otherStuff() should be able to be a template parameter for foo.  Not just things that happen to implement an IStuff interface.  My thoughts at least...

Michael Kale
Tuesday, May 25, 2004

I should make it clear that I didn't come up with the idea of concept checking.  In fact Jeremy Siek has done A LOT of admirable work in boost with the boost concept check library.  In fact it has gone a long way in making g++'s STL more useable. 

I feel that the library isn't the place to do this sort of thing, and it is going to be really difficult to fix the problem in the language, which leads me to conclude C++ has a serious problem.

I feel more strongly than most that the lack of concept checking is a significant problem in production code bases as the interface specification mechanism is completely lost.  On toy applications you don't notice the problems.  Or at very least they are easy to ignore.  It is when a project scales up to a few hundred thousand lines of code that the lack of maintainability shows through. 

Sometimes I look at the custom STL iterator code I wrote 5 years ago and wonder, "what the hell was I thinking?"  If you buy 100% into generic programming and the STL it is natural extension.  It is just horribly difficult to maintain in the long run. 

christopher baus (www.baus.net)
Wednesday, May 26, 2004

*  Recent Topics

*  Fog Creek Home