Fog Creek Software
Discussion Board




Some Questions

(1) How are collection objects in high level languages implemented internally? As link lists, I would guess? Is there not a performance hit with that? I would expect a Dictionary to be faster than a link list. If they are nothing but link lists with some meat on, then they must be slower than the link lists. K&R as well as Joel's Back to Basics article describes the disadvantages of malloc in terms of performance. Because malloc internally implements two link lists to used memory and the free chain, as the number of items in a collection grows, the Pop() must get more expensive.

Do we actually have implementations of link lists in our programs? We mostly use high level data structures like Collections, Hash tables, Stacks and Queues. Could you please provide me with a real life example where a link list implementation would be indispensible? Or would it just be re-inventing the wheel?

(2) Last week, a gay activist was being interviewed on a news channel and he was introduced as a puppeteer. Since puppet also has a homo-sexual connotation, I wanted to know if the word puppeteer also has a meaning besides the literal meaning indexed in the Webster's dictionaries. I didn't find anything here.

(3) While rummaging through a C programming book, I found some examples of matrices and determinants. I understand them, but I can't think of a situation where they could be applied in real life situations. Could someone please tell me why I would need a matrix in solving a real life problem?

(4) Good News: I jumped for joy when I heard last week on TV that The Apprentice was coming to India on Star World.


Apologies if my questions sound dumb.

Sathyaish Chakravarthy
Wednesday, August 25, 2004

3) Matrices can be used to solve mathematical network problems (that is one use).

I know matrices are also heavily used in EE, not sure what for though

Aussie Chick
Wednesday, August 25, 2004

If you really want to know the applied side of matrices, go look at Mathworks site, the company that makes Matlab: http://www.mathworks.com/

Yet another anon
Wednesday, August 25, 2004

This doesn't directly answer your question but talks about how perl internally implements lists and strings using normal C functions: http://www.perl.com/pub/a/2001/06/27/ctoperl.html

Matthew Lock
Wednesday, August 25, 2004

Matrices can be used to solve sets of linear algebraic equations.  This comes in handy when displaying 3D graphics on a computer, for example.

You multiply the set of points that make up the 3D object model through the World, View and Projection matrices and viola.


Wednesday, August 25, 2004

"(1) How are collection objects in high level languages implemented internally? "

I haven't looked at the code, but Perl originally called an associative array (or Dictionary) a "hash"; it was implemented via a hashing algorithm into an array, so it was very fast.

Tom H
Wednesday, August 25, 2004

>Matrices can be used to solve sets of linear algebraic equations.  This comes in handy when displaying 3D graphics on a computer, for example.

Cool! Graphics. Makes sense!

Sathyaish Chakravarthy
Wednesday, August 25, 2004

As far as linked lists and collections go - when I used to develop in C, I implemented linked lists whenever I needed to store arbitrarily large ordered lists of items. I generally stopped creating my own linked list managers when collections became available as standard classes.

I'm pretty sure that collections in most class libraries are implemented as linked lists.  The only other alternative for such storage characteristics is an array, but you'd have to shuffle elements internally to implement insertion. I suspect that the linked list is the lowest-overhead way known to implement this storage behavior.

You can always download the standard template library to see how it's done there.

And, collections are never "part of" the language - they are always an add on class library.

Bored Bystander
Wednesday, August 25, 2004

BB,


>And, collections are never "part of" the language - they are always an add on class library.

Yes, I didn't mean that either. What I meant was that as a developer of a high-level language, they are pretty much _given_ to us.



>You can always download the standard template library to see how it's done there.

Where can I do that? I don't know much about C++, actually. Just a little bit.



>I suspect that the linked list is the lowest-overhead way known to implement this storage behavior.

That's exactly the debate. With an array, you have a memory constraint. If you want to use realloc, you're in turn crippled with the same flaws that malloc has. And if you don't use the array, you use the link list. So, do I have it right then that the high level objects have no significant advantage over link lists, except that in an environment like .NET, they are managed for asynchornous access and stuff.

Sathyaish Chakravarthy
Wednesday, August 25, 2004

"Could someone please tell me why I would need a matrix in solving a real life problem?"

Virtually every product you buy has been manufactured with machine tools and robots that use matrices for 3D motion control.

sgf
Wednesday, August 25, 2004

The advantage is that we don't have to keep reinventing the wheel to use high level data structures over and over again.

Whether they are in a library like STL or part of the language like perl I would much rather focus on what's unique about my problem rather than build my own wobbly dynamic arrays.

Matthew Lock
Wednesday, August 25, 2004

I honestly don't know what you're asking or debating. But being a masochist I'll take a stab.

>> >I suspect that the linked list is the lowest-overhead way known to implement this storage behavior.

>> That's exactly the debate. With an array, you have a memory constraint. If you want to use realloc, you're in turn crippled with the same flaws that malloc has. And if you don't use the array, you use the link list.

As far as dynamic memory having inefficiency - for pete's sake, if you want to have an abstraction like "pool of memory" available as a tool, you need to tolerate some inefficiency. Every convenience has a price. As memory is used and recycled, a pool of memory tends to fragment and this makes allocation times potentially longer.

Secondly - linked list classes typically preallocate a pool of their own storage units of fixed size, in order to minimize the overhead of hitting malloc() every time the list is changed. I've seen this in MFC and I've seen it in the Delphi class libraries. If each link is, say, 20 bytes in size, then a container class may pre-allocate storage for 1024 links (for instance) and dole out links from this pool. As the pool is exhausted, another block of link storage is allocated.

As far as run time efficiency, this is rather obvious. A linked list has a roughly invariant speed per insertion. With an array, adding to the end of the array - only- is a low overhead operation. Inserting an item to the middle of an array will have a processing time associated with it that is proportional to the number of entries already above it in the list that need to be relocated. Because all the entries above it need to be copied upward. This is obvious. Same problem in reverse with arbitrary list deletions; memory has to be compacted downward.

Example: text editors and word processors are almost always implemented with a linked list for storage of the text data. The user's keystrokes in the middle of a large document should not cause a "freeze" on every character entry.

There are basically 'reasons' associated with choosing each type of storage. If I am only going to add entries and never insert entries into a list, then my best choice of storage is a memory array.

>> So, do I have it right then that the high level objects have no significant advantage over link lists, except that in an environment like .NET, they are managed for asynchornous access and stuff.

Basically, that's right. Because of how memory "is". The "asynchronous access" is to protect the variables that keep the list from being garbaged by code that interrupts another thread that is also manipulating the list.

Bored Bystander
Wednesday, August 25, 2004

Thanks for the spirit, BB. You've actually answered all that I had to ask. I am going to take your write home and yummy...feast, feast, feast, it'll be.

I am sure I'll be thinking more clear in a few hours. There are a lot of new things for me to learn in this post of yours. The word processing example, it's going to be a new perspective for me to think about linked lists; something I've thought about but have only upto a point where I had no answers and no one to ask. And the pool thing about reserving extra memory, sounds interesting.

The problem with me is that when I hear something interesting, many more interesting things start to spin around in my head. Your post triggered things like garbage collection and stuff, but I now right now I am not in the right head. I've been awake too long. I am not closing this thread. Will refresh my memory and will come back.

Thanks!

Sathyaish Chakravarthy
Wednesday, August 25, 2004

linked lists and arrays can be combined to give a meaningful synergy.

A linked list of arrays, segmented arrays in Symbianspeak, are widely used in embedded spaces where both memory and performance are critical.

i like i
Thursday, August 26, 2004

1)  Depending on the Collection, some will be implemented as linked lists, but others will be implemented as arrays (many times Vectors, for example, will be implemented as arrays, with a realloc happenning when it needs more space).  It slows down insertion into the middle (or beginning) of the Collection, but it greatly speeds up access/retrieval of arbitrary elements.

Of course, the reason these are provided in the Standard Library is so you don't have to worry about how  they're implemeted.  :)

2) "Puppetteer" does not have any colloqueal meaning, as far as I know ... just what's in the dictionary.  (I'm in the U.S.)

One of the Matts
Thursday, August 26, 2004

Matrices are used in specifying ways to solve all manner of optimization problems. Find some info on linear programming. This will get you started. There are other methods that solve quadratic/non-linear problems too.

I apologize if I'm insulting your intelligence.

A linear program would be something like:

Maximize:

3.5x + 4y + .02z

Subject to:

x + y + z < 100
5x + 3y < 60

Rob VH
Thursday, August 26, 2004

"Since puppet also has a homo-sexual connotation..."

What about muppet?

straight as an arrow
Thursday, August 26, 2004

> I would expect a Dictionary to be faster
> than a link list.

As with all performance questions you have to specify what you mean by "faster" -- faster under what conditions?

The Scripting.Dictionary object is implemented as a hashing-with-chaining table -- it is basically an array of linked lists, and we decide what list it goes on based on the data.  Since the largest asymptotic cost in a linked list is typically searching the list, this cuts down on the search time by a factor of the list-array size.

The .NET dictionary collections are extensible hashing with chaining -- when the lists get too long, they re-hash the table into a larger number of buckets.  That way the linked lists never get too long.

For some discussion of various issues, see my blog:

http://blogs.msdn.com/ericlippert/archive/0001/01/01/132160.aspx

You might want to read Rico Mariani's web log as well.

Eric Lippert
Thursday, August 26, 2004

> Could someone please tell me why I would
> need a matrix in solving a real life problem?

Matrices are used all over the place, but two common real-world usages are the solution of systems of equations -- linear equations, differential equations, etc, all use matrices -- and rigid transformations of objects. 

For example, if you have a 3-d coordinate system and you have coordinates for the corners of an object, and you want to rotate that object and then slide it four units north, you can represent that operation by a series of matrix multiplications on the coordinates.

Eric Lippert
Thursday, August 26, 2004

What about question 2?

Throwaway
Friday, August 27, 2004

Thanks, Eric! I am a little bit drunk at the moment, so I really don't understand anything. I'll get back later.

Oh beer!

Sathyaish Chakravarthy
Friday, August 27, 2004

*  Recent Topics

*  Fog Creek Home