Fog Creek Software
Discussion Board




Why learn basic search and sort algorithms?

The post a couple posts ago about learning basic search and sort algorithms got me thinking.

I'm somewhat embarrassed to admit it (being a self-taught programmer), but in the work I've done I've never had any occasion at all to make direct use of any of the searching or sorting algorithms.  I don't deny that it would be nice to know them, or that they'd be interesting to learn, or that learning them would have some (indirectly) beneficial affect on the quality of my programming.

I have to admit, though, that given ever-present time and resource constraints I don't expect to be spending much time learning the basic search and sort algorithms. 

I just don't see where I'd want to use them.  I program database applications, mostly using Delphi, and any time that I need to perform searches or sorts they are already implemented for me as methods of whatever components or objects I'm using. 

At the level of the database itself, SQL does searches and sorts transparently to the programmer.  On a different level, I often load database data into datasets in memory.  I often search and sort those objects, but when I do I make use of searches or sorts that are already implemented within the dataset object (think ADO or ADO.NET recordsets). 

At a lower level I might have a need to sort or search an array or list of integers or strings, but even on those occasions I can load values into a basic list or stringlist object and once again the sorting and searching methods are built into the objects I'm using. 

I don't know whether the search and sort algorithms built into the components I use are the most efficient possible.  But, frankly, I don't much care.  I've yet to write a program where a simple search or sort ended up being a bottleneck.  (Some SQL queries, of course, can end up being unreasonably slow, but it's not like knowing basic search and sort algorithms is going to help me optimize those.) 

So I'm curious. 

For what sorts of things do programmers typically drag out their "basic search and sort algorithm" knowledge?  It must be at a level of abstraction lower than the ones I typically work at, since I don't see a need for it. 

Herbert Sitz
Monday, August 16, 2004

See, it's sort of like a dick measuring contest, only with hand-rolled data structures that are irrelevant to know about in any modern high level programming, since everything you need has already been invented and is either in the core language, or in a readily available library.

College educated programmers revel in them because they're taught about them ad nauseum by ancient professors who still think it's important.  It's sort of like learning about quantum physics by starting with basic gravitational acceleration equations to derive g.

muppet
Monday, August 16, 2004

Aren't basic sorting algorithms good to know when the high level implementation is too slow or inconvenient to use? 

.not
Monday, August 16, 2004

I've never implemented any variety of sort from scratch in my entire professional life.

However, I often use the skills that I developed while studying why some sort algorithms are good, and some are bad, and why some work better in certain situations.

I also learned a great deal about problem solving while studying assembly language. My third year linguistics class taught me a great deal, but I doubt I could ever apply any facts from that course directly to my work. I will never have a conversation in German, but my German course was very interesting.

Actually, forget that last one. I did get a few dates with the TA out it ;-)

Tim
Monday, August 16, 2004

Sorting is a great teaching tool because it's general and there are scalling issues that make it so that it's never completely obsolete.

Miles Archer
Monday, August 16, 2004

That's it. 

I can't take this shit anymore.  Joel take note.  This guy is driving people from your message board.  He turns every thread into a flame war. 

The board has reached a point where it isn't useful.  I think all successful groups eventually reach this point.  It is getting as bad as slashdot around here, which I stopped reading years ago when the "First Post" crap started... 

christopher (baus.net)
Monday, August 16, 2004

Why do we teach how to add with pencil and paper in a world with calculators?

Same thing.  No one learns how a sort algorithm works because they're going to be implementing a kick-ass sort algorithm at some point in their lives.  Very few developers actually do implement from-scratch sort algorithms because usually the underlying framework that they're using has one built in. 

But learning how to do so has two major advantages.  First, should the underlying framework NOT give you what you need, you're not stuck.  You can roll your own. 

Second, the skills you gain in learning how sort algorithms work generalize to other kinds of algorithms.  If you don't know what a HeapSort is and how it works, it's pretty darn hard to look at a non-sorting problem and realize that you could use a Heap as a data structure to solve the problem. 

Eric Lippert
Monday, August 16, 2004

I disagree.  You don't need to learn sort and search algorithms to learn the fundamentals of programming.  I'm perfectly capable of designing new (to me) solutions to heretofore unencountered (by me) problems, and I daresay that I write pretty efficient code for someone with my level of experience (about 5 years).  I understand the argument FOR teaching search and sort and the like as the 'basics', but frankly, I think that's a lot of handholding for people who don't 'get' programming natively, for lack of a better term.

There are those who eat, drink, and sleep code.  I'm one.  There's lots of others.  Those sort of people don't suffer in cases where they haven't been indoctrinated by conventional CS programs.  In fact, as I've said before, I think it's a distinct advantage not to have been.

The culture on this site and the responses I've garnered to this sort of statement have only reinforced that belief for me.

muppet
Monday, August 16, 2004

Well it can help to understand the fundamentals even if you use external search and sort mechanisms and after all in memory sorts of data sets of any size are actually going to be in a database of some kind these days.

Understanding how sorts and searches works helps you build indexes,  just as knowledge of set theory should give you the right order of joins and search clauses if you aren't using a DBMS that 'optimises' them for you.

Simon Lucy
Monday, August 16, 2004

What are you on about, Christopher? Are you whinging about muppet? If you are, I would prefer you left, not him.


Monday, August 16, 2004

Not being stultified by a degree in CS is one thing, using only your own experience as a yardstick for knowledge is an entirely different thing.

Simon Lucy
Monday, August 16, 2004

> You don't need to learn sort and search algorithms to learn the fundamentals of programming.

I did not I made that claim so I'm not sure what you're disagreeing with.

Systematic academic study is not the only road to knowledge and I have never claimed that it is.  It's just one of many that we have reasonable evidence works well for a lot of people.

Eric Lippert
Monday, August 16, 2004

All right well on that, we agree.

muppet
Monday, August 16, 2004

Database applications are "the common application" so-to-speak.  They are the staple of businesses and thus have the broadest range of effect on IT.  SQL and data handling libraries such as ADO make it trivial to manipulate the data.  It's common for people without degrees and people with associates degrees and some bachelors degrees to be hired into these positions for manufacturing and accounting and finance, etc... firms.  The abstraction is there because of the wide-spread use.

Now take a bachelor, master or PHD degreed person and they will be more likely to be working on commercial software applications that don't involve databases.  PhotoShop, QuarkXPress, InDesign, MS-Word, MS-Excel etc etc..  all require a higher degree of "finess" when it comes to coding them.  For example, when writing a spreadsheet, you don't have the luxury of using SQL to sort the rows.  When writing a word processor you don't have the luxury of predefined search routine to find a synonym in a library.

Of course it depends on the language and the libraries used. (C has quicksort, etc..)

I'm not sure what to term the class of applications that you would need to use or know this stuff but I would say "non-database related commercial applications?"

Level of abstraction is the key.  The lower you start the more you need to know.

(I'm not making a statement about education level here or that one degree or no degree is better than the rest.)


Monday, August 16, 2004

> re: muppet
> See, it's sort of like a dick measuring contest

Anybody that considers CS fundamentals a "dick measuring contest" has no place in this industry. 

I don't see how this guy thinks he can get any respect by insulting a good portion of the readership. 

I'm not saying that you should go out and implement a doubly linked list when std::list is available.  But I am saying you should understand how they are implemented. 

Computing is about data manipulation.  Data structures are commonly used to manipulate data.  These are the fundamentals of our trade.  It isn't a "dick measuring contest."  It is about having the knowledge necessary to do the job of a developer.

Muppet has a pretty crass way of making a point.  I just don't have any interest in this type of conversation anymore.  I don't find it the least bit constructive.   

christopher (baus.net)
Monday, August 16, 2004

No but you're still making a generalization, although you're probably on target as far as the trivial examples of search and sort are concerned.

Still, I don't think that working primarily with database-driven applications makes on programmer less skilled than one who deals primarily with memory/disk based apps.

Yes, coding without a DBMS means more burden on your code.  No, using a DBMS doesn't necessarily make you an unskilled or less intelligent developer.  I know how to deal with memory management, files on disk, and the like.  The first apps I wrote  (self-assigned learning projects) didn't use any DBMS, and I had to 'invent' a lot of the concepts that most guys learn in their CS programs.  I think that's a much more valuable education, but that's my opinion.

muppet
Monday, August 16, 2004

It becomes a "dick measuring contest" when a programmer's skill and ability comes down to who can design the most efficient doubly linked list, or who can write the most clever sorting algorithm for their grocery list.  Everyone's talents lie in slightly different patterns.  Because you peak here or there does not mean that I'll peak in the same places.  It doesn't make you better than me or vice versa (although with your attitude, christopher, I suspect that you'd be a poor coworker).

muppet
Monday, August 16, 2004

You learn it so you can answer stupid interview questions. I applied for a Perl job at Amazon. The first few interviews made me question what I was doing. They asked me webdev questions more related to html than anything else. Then I get to a guy who starts asking me programming questions. He asks me how I would sort something. Hello, this is Perl. You don't roll your own if you're doing Perl, the sort function, maybe using spaceship operators, yes caching is good ala schwartzian transform but no I don't write sorting algorithims as I usually work with SQL which sorts for me. Did you read my code? Are there any Perl programmers there or just jerks with CS degrees. 

Yet of course you learn stuff to learn. I think everyone ought to learn assembly. It's interesting to see all the same concepts which spring from there. And gosh, who knows, maybe you need to drop down to that level, oh yeah, right.

me
Monday, August 16, 2004

> although with your attitude, christopher, I suspect that you'd be a poor coworker

Ba Ha Ha Haa Ha....

Man take a look in the mirror someday. 

This isn't worth my energy.

Nazi's and Salad Cream.  I'm out of here.

christopher (baus.net)
Monday, August 16, 2004

Hmmm, even if you don't get any formal education in making carts you tend to learn how to make wheels from someone else that made a wheel and you tend to have to do very simple things over and over again before you get onto the complicated stuff like wheels.

Just because you believe you've managed to gain an adequate footing in your trade, muppet, and I've no particular reason to doubt it, it might strike some as pushing the limits of credibility if you felt that you knew it all, or to put it in a way which seems commensurate with your normal tone, enough to do very little and get paid for it, but know far more.

Personally, I find Herbert's approach more interesting.

Iconoclism is all very well, but, to paraphrase Bernard Levin, before you can consciously break the rules you have to avoid breaking them unconsciously.

Simon Lucy
Monday, August 16, 2004

Nobody knows it all, least of all me.  I just responded to the OP's question by saying that going through all of the textbook algorithms by rote is all well and good, but not the ONLY way to gain a solid foundation in programming.

muppet
Monday, August 16, 2004

There are two ways to gain technical knowledge and practical skills in our chosen profession:

(1) By a disciplined approach to learning, usually at a University.  Here is where you will slave away learning sorting algorithms as an undergrad, and maybe doing formal proofs of network routing algorithms as a Master's candidate.

(2) Writing code by the seat of your pants, and learning all you can along the way.

I submit that the ideal, most well-rounded developer, will have done both.  The developer who has done one or the other, but not both, will ultimately be deficit in one or more areas of expertise, maturity, and confidence.

You know who you are.

Mitch & Murray (from Downtown)
Monday, August 16, 2004

++usually at a University++

with the qualifier of "usually", I'll agree with that.

muppet
Monday, August 16, 2004

For me it was simple intellectual curiousity.  I, like muppet, am a self-taught developer and I can't imagine how I would've gotten where I am if I didn't want to learn things just for the sake of learning them.

I can guarantee one thing no matter who you are or what niche you're in: it can't hurt.  In my experience you'll notice that the basic techniques of good algorithm writing are universal, but even if you don't, even if that doesn't apply to you whatsoever, you won't be any worse off by knowing how to implement homegrown sorting.  Like someone else said a few posts ago--even if you never use it practice, it might make good conversation.

Shane Harter
Monday, August 16, 2004

As an "uneducated lout" I think you'd be better off teaching yourself the details of various compression algorithms (and then the accompanying "Information Theory").  Have you done this yet muppet?

Kalani
Monday, August 16, 2004

Shane -- I agree with what you say.  I think the algorithm stuff is kind of interesting and fun.  I've got a great book by a guy named Julian Bucknall called "Tomes of Delphi: Algorithms and Data Structures".  It really is a great book.  I plan to spend more time with it.

But I'm still not sold that studying up on "low-level" stuff like search and sort algorithms is the most effective use of my time.  Sure, if I had unlimited time I would love to get all the different algorithms down cold.  And I do see how learning them can have some beneficial effect on my programming skill. 

But I also have a limited amount of time.  Why not spend it on something that affects my ability to program more directly, like learning a really complex object model of a spiffy new component that operates at a very high level of abstraction and is going to do all sorts of work for me?

I've said before that I want to program at as high a level of abstraction as I can.  Ideally I would write no code at all, just glue high-level components together into an application by position them on screen and setting design-time properties.  That's not possible yet, at least using the tools I have.  I still need to code.  But I want to avoid coding at a low level of abstraction, and the search and sort algorithms are fairly low level.  I don't need them.

Once again, I'll say that I do realize that by not learning the search and sort algorithms I'm ignorant of some of the basic underpinnings of the code I write.  But I want to work at as high a level of abstraction as I can.  The whole idea of working at a high-level of abstraction is that I can get the benefit of low-level stuff without having to understand it.  I'm still not sold that it would be a wise use of my time to learn the search and sort algorithms, given that what I do is design and build basic end-user database applications.  (And, yes, I do know about Joel's idea of "leaky abstractions".  But so far I've been able to go lower-level when I have to, and it's never involved writing a search or sort algorithm to be used in one of my programs.)

I do plan to spend some more time with the Bucknall book I mentioned.  And I'll consider myself a more well-rounded programmer when I get up to speed on the various algorithms (and get more comfortable with pointers along the way!).  But I plan to do it because I think I'll enjoy it, not because I think it's going to have a big payoff in helping me write better programs.  Maybe it will have a bigger payoff than I think.  That would be nice, but I can't see it right now.

Herbert Sitz
Monday, August 16, 2004

+++'ve said before that I want to program at as high a level of abstraction as I can.  Ideally I would write no code at all, just glue high-level components together into an application by position them on screen and setting design-time properties.  That's not possible yet, at least using the tools I have.  I still need to code. +++


You're not a real programmer.  Go learn VB and quit trying to talk to the grown ups.  ;)

muppet
Monday, August 16, 2004

"The whole idea of working at a high-level of abstraction is that I can get the benefit of low-level stuff without having to understand it."

This is where I disagree. The idea of working at a high level of abstraction is the benefit of the "low-level stuff" without having to implement it over and over again. 

When you have QuickSort and BubbleSort options in your library how will you know which one to use?  Apply that across the board to compression (as someone mentioned before), encryption, you name it. 

How could you ever judge the *quaility* of the abstraction if you don't understand it's mechanics? 

Shane Harter
Monday, August 16, 2004

It depends if you care. If you don't care then fair enough. Not really much point learning this shit, as you point out.

If you care enough about it to become more than a mediochre programmer, if you want real understanding about what is going on behind the scenes, then go and learn it. Search and sort algorithms are a great introduction to the kind of thinking that'll make you a  great insightful programmer as opposed to just a database twiddling monkey (no offence)

Matt
Monday, August 16, 2004

Now and again, you can find lucrative applications that require searching and sorting and even some form of  database that cannot be ECONOMICALLY implemented with sufficient performance in off-the-shelf software by VBA plug'n'play programmers who think SQL can solve any stiff problem. Just a few:

Traffic control
Lotteries
Target acquisition
CAD
Weather
Imaging
Cryptoanything

Broaden your thinking.

trollop
Monday, August 16, 2004

The benefit of learning search and sort algorithms is not for the purpose of writing code that implements them.  The benefit is from understanding which data structure to use or which pre-built sorting implementation to choose.

T. Norman
Monday, August 16, 2004

Why is it black and white for you guys?  If I'm not a CS trained developer, then I'm a SQL jockey?  That's absolute rubbish.

muppet
Monday, August 16, 2004

I know why I learned it. It was fun.

I picked up the book "Data Structures and Algorithms in Java" and liked it so much I read it from cover to cover. As another "self-taugh programmer" I always feel behind in the discussions I have with colleagues. Not in the basic every day stuff. I can create architectures and designs for pretty much anything my current position requires. It's the 2% of the time, when you do have to understand that pulling back 1100 records instead of 1150 will increase performance because the database doesn't have to use another page. Or that the index on a table causes one of your joins to do a full table scan.

Knowing the concepts of trees and lists and sort algorithms has made me a better programmer. I used to think it made me a pretty darn good one. But for what I want to do long term - research, teaching, architecture - I wouldn't imagine trying to do it without a formal education. Not because I don't think I am smart enough, but because I don't feel I can offer as much without exposure to the concepts from a uni course.

And slightly off topic, but it is fascinating watching various posts degrade to flame wars so quickly. I guess such comes with popularity.

CF
Monday, August 16, 2004

I didn't read most of the replys here, as they mostly seemed to be about muppet in some way, so apologies if this is redundant.

For the vast majority of software development, you'll only be automating business logic and processes. CS theory is of marginal utility in those cases.

If you want to work on some really geeky stuff (like a 3D game engines, OS kernels, compilers, VMs, DSP, distributed systems, image recognition, etc) then a firm CS foundation is absolutely necessary. You need not get it in school, but you will need it. Learning sorting algorithms is just the start of that journey, there is a way way way more CS theory to learn beyond that. You can't learn it all, but you can't begin to know what you don't know until you start at the basics. I know it kind of sounds like Yoda talk, but its true.

I very much prefer the geeky stuff, but there are far more jobs to be had and easy money to be made writing business software.

ronk!
Monday, August 16, 2004

"Search and sort algorithms are a great introduction to the kind of thinking that'll make you a  great insightful programmer as opposed to just a database twiddling monkey (no offence) "

Sorry, but at this point I see the issue as not much different than the "real programmers use pointers" discussions that come up every so often on JoS.  I think it's pretty settled that real programmers don't necessarily use pointers.  (And if all real programmers do use pointers, then I'll take it as settled that real programmers don't necessarily produce better programs than non-real-programmers.)

Would I be a better programmer if I had a strong knowledge of search and sort algorithms.  Yes.  Does not having it make me a "database twiddling monkey".  Sorry, nope, it doesn't.  (And no offense, taken, I realize there's nothing personal here.)

A huge amount of the work involved in producing a database-driven business app is in the design.  How will the different parts of the program fit together?  What is the best user interface?  What object structures should I use?  How do I best leverage the high-level components I'm using?  Coding comes last, and in my opinion there are a lot of "code monkeys" out there who assume that they can code a great application, but because they don't have a good sense of overall design or because they just blast into coding too quickly, they find out that they can't.  Or, rather, other people find out that they can't.  Lots of the code monkeys think they've created great programs but they're actually pieces of crap to use.)

In fact, I'm sure that for the type of programs I write I can create them better and faster than lots of "bit twiddling real programmers" who can code circles around me.  If that's true, and I think it is, it's because I have a better sense of the overall design, even if I'm not as good at the low-level bit twiddling. 

Abstractions are admittedly leaky, but I have little doubt that for the applications I create the best way to make them is to program at as high a level of abstraction as possible.  When I come across a leak in my abstraction level, I have to be able to effectively delve down and plug it.  But by knowing my tools well and by spending time to design my applications to take advantage of my tools strengths and avoid their weaknesses, I can avoid troublesome leaks, and in so doing avoid writing huge amounts of code.  (In so doing I also avoid getting a huge mass of code that is difficult to debug and maintain.  Code monkeys may have fun with that, but not me.)

If I'm learning algorithms for fun, then that's all well and good.

But if I'm learning algorithms to help me in my job of making better programs, then I've got to decide on what's the most cost-effective use of my time.  Should I spend more time learning the ins and outs of my high-level tools that if used right can help me leverage the hell out of the time I spend building my application?  Or should I study search and sort algorithms that may help me a bit when I have to delve down below my usual high level of abstraction, but otherwise are of limited practical use? 

To anyone who still thinks you've got to know search and sort algorithms to be a "real programmer", can you really tell me that the answer really all that obvious?

Herbert Sitz
Monday, August 16, 2004

Oh yeah, I forgot to say, for employabilty, being a good software engineer is more important than being a good computer scientist.

By good software engineer, I mean things like can architect the code so that it is easliy maintained and extended, designs the system to be tested by an automated unit tests and system tests, writes clear, clean code and documentation, uses proper error handling and reporting, builds a good gui, etc.

None of those things really have anything to do with CS. The difference between Computer Science and Software Engineering is like the difference between Physics and Mechanical Engineering.

You can know all the theory, but if you code is crap then whats the point?

ronk!
Monday, August 16, 2004

>"The whole idea of working at a high-level of
>abstraction is that I can get the benefit of
>low-level stuff without having to understand it."

"This is where I disagree. The idea of working at a high level of abstraction is the benefit of the "low-level stuff" without having to implement it over and over again. "

Sorry, that just doesn't make sense to me.  Abstractions are not just a fancy way to make code reusable.  They do far more than that; they "abstract" out the need to understand the underlying concepts.

For example, in his leaky abstraction article Joel talks about the idea of a file system: 

"What is a file system? It's a way to pretend that a hard drive isn't really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes."

The file system does not abstract away from the idea of spinning magnetic platters with different charges storing information at different locations, just so that something doesn't have to be implemented again.  The file system abstracts away so that you don't even have to understand that there exists such a thing as a hard disk at all, and you can still have a fine workable understanding of a "file".  Millions of secretaries do that every day.  When abstractions become good enough, people can use the abstractions without having a whit of knowledge about what exactly is being abstracted.

Programming abstractions are not always at that level.  But in most of my programming with strings I simple deal with a String data type.  I don't have to know that the string is actually an array of bytes or (double-bytes) that is located at a certain position in the RAM of the computer.  I just use a String. 

Nope, abstraction is not primarily about making it easy to implement something lots of times.  It's about hiding low-level details so that you don't even have to know they're there.  Most programming abstractions are leaky, so you do in fact need to know about something beneath the abstraction.  But I'm not sure you need to know about as much down there as some of you "real programmers" think. 

Of course, it does always help to have lower level knowledge.  But there's some point at which gaining lower level knowledge stops being cost-effective.  I'm curious where that point is.

Herbert Sitz
Monday, August 16, 2004

Herbert (and others), there's more to searching than the "sort" method of your favorite container class.  In general, this is why people tell you to learn sorting/searching.

As an example, the problem of parsing a source file in your favorite programming language is equivalent to a type of search (and can be depth-first, breadth-first, etc).  So is visible surface determination in 3D game engines.  Understanding some things about properties of searches in general will help you to understand some aspects of things that don't obviously seem to be related to searching.

I suggested data compression earlier because it's not covered much by most "self taught" programmers, but general sorting/searching is also very important.

Also, I think that some people are mixing up an "academic" approach to CS with bit twiddling and understanding low-level details.  Yes it's important to know about cache misses and your virtual memory manager's paging system, but the "academic" approach typically has more to do with removing details from a problem that aren't necessary.  That's a very valuable talent to foster.  If you look at a big tangled mess and think "finite state machine with states X, transitions Y, etc" then you'll be able to handle hairier problems.  Academics "understand more" by simplifying the problem, that's their tradition.  Read a biography of Isaac Newton ("Never at Rest" is the best).

This argument about "self taught" versus "academic" practitioners is pretty tired, I think.  I'm self taught myself but I see a lot of value in the work of the academic community.  The people who get elitist about going to school (or not going to school) are just being petty (probably out of ignorance).  The most important thing is to study hard and follow every curiosity you've got while you've got time to do it.

Whatever your situation, stay away from people who'll prejudge you based on nothing but that situation (Eric Sink, etc), and try to find people who have the same principles of meritocracy that you do (who at the very least will judge you based on your actual work).

Kalani
Monday, August 16, 2004

First of all, I think it's useful to talk about optimizing for the common case. Needing to know an algorithm's properties is a fairly common case, but implementing it is uncommon. So a reasonable high-level knowledge, like "what is a hashtable's expected vs. worst-case complexity?" is useful.

Also, someone who can't quickly write a binary search offhand could look at a book and copy the algorithm. If you go by Knuth, it took 16 years between binary search's invention and the first correct published version! Programming Pearls has a whole chapter on its pitfalls.

Obviously all knowledge is resourceful to have. I'm only addressing overemphasis on implementing single-threaded imperative algorithms.

Bad educational materials is a problem. "The Art of Computer Programming" is seen more as "The Accounting of Computer Programming." I don't think it has shown many people where to find art in programming. To be strongly controversial, I would suggest people look at Russel/Norvig's AI book instead; chapter 3. Just a quick look, forgetting the hype of the term "AI". It explains where these algorithms come from, and kind of humanizes them. Even more importantly, it allows you to use the power of languages like Ruby and Python, rather than just being stuck in the C morass.
http://aima.cs.berkeley.edu/

Tayssir John Gabbour
Monday, August 16, 2004

Excuse me, that was Russell/Norvig's book, I misspelled Russell's name.
http://aima.cs.berkeley.edu/

Tayssir John Gabbour
Monday, August 16, 2004

I'm half self-taught, half formally educated. My impression is that while search/sort algorithms are rarely implemented by developers in the field, they serve as pretty good textbook examples.

They're to algorithms & data structure theory as factorial & fibonacci are to functional languages tutorials, but with some actual usefulness :P

I went trough an algorithm class in college. Here's what I remember from the sorting part of the course.

As far as I know, Bubble sort is some kind of sorting straw man, used to show O(n^2) behavior & to start off Algorithms 101 courses.

Quicksort, being O(n * log n) -- well, in the best case -- & recursive, is a lot more interesting. A big part of the fun is proving its complexity: for iterative algorithms like Bubble sort it's pretty easy, but things get tricky for recursive ones like Quicksort. Classic textbook example/exercise. Also, if you ever check out Haskell, one of the first programs you'll see will be this quicksort implementation:

qsort []    = []
qsort (x:xs) = qsort elts_lt_x ++ [x] ++ qsort elts_greq_x
  where
      elts_lt_x  = [y | y <- xs, y < x]
      elts_greq_x = [y | y <- xs, y >= x]

Heapsort is mostly used to show off heaps, what they are & when to use them. I can't quite remember what Mergesort was about.

We then moved on to dynamic programming, and finished the course with some graph theory & algorithms (search, depth-first, breadth-first).

I ended up with a D+. I hadn't turned in most of the assignments nor shown up in class for most of the semester. I eventually dropped off the program entirely: it was too mainstream for my taste (all C++ and no Lisp makes Homer something something).

I found what I wanted in computer science in books like SICP ( http://mitpress.mit.edu/sicp/ ) and CTM( http://www.info.ucl.ac.be/people/PVR/book.html  ):

alricb
Monday, August 16, 2004

I have implemented search and sorting algorithms in almost every application I have written.

Knowing how they work and their characteristics is important or you can't choose the best one.

Once upon a time there were some guys that put together a  setup taht could search many terabytes of info in milliseconds and then sort it according to likely percieved relevance to the user.

Thes guys are doing their IPO right now and their work has made them billionaires.

The Quiet Man
Tuesday, August 17, 2004

> I'm perfectly capable of designing new (to me) solutions

If you'd learned the algorithms you wouldn't need to do so.

If you don't know something exists, or in what cases it should best be used, you're going to have to do a lot of work of your own that you wouldn't otherwise need to do. Laziness is good.


Tuesday, August 17, 2004

Quicksort sorts Arrays in-place, as does BubbleSort.

Mergesort is good to use on linked lists.

And sometimes, you don't want to sort at all.  You just store your items in an automatically-sorted structure, like some sort of self-balancing tree.

Implementing search algorithms are very good programming exercises, especially in C.  The new programmer will get a lot of first hand experience with off-by-one errors and ArrayIndexOutOfBounds errors.

BubbleSort is the easiest to implement.  It's how most humans sort things, but humans will short-circuit the sort based on their birds-eye-view of the entire dataset.  If you ask someone who's never written a sort algorithm before to write a function to sort an array, BubbleSort is what they'll come up with.

    for (i = 0; i < array.Count; i++)
          for (j = 0; j < array.Count; j++)
              if (array[i] > array[j])
              { /* swap i and j */
                  tmp = array[i]
                  array[i] = array[j]
                  array[j] = array[i]
              }

Richard P
Tuesday, August 17, 2004

Herbert,

Learning the common searching and sorting algorithms is very cost-effective because it doesn't take that much time.  Pick up a good algorithms book and you'll learn them in a weekend or two.  You don't need to be able to formally prove their big-O complexity.  You don't even need to remember the details of every step of them, just know their general charactersitics so you can know what details to look for in a book when the time comes.

T. Norman
Tuesday, August 17, 2004

Isn't this why Joel wrote the leaky abstraction article? At some point your ignorance is going to inhibit you. I think the analogy of learning addition and multiplication is a good one. Why would you learn that when a calculator will always be quicker and more accurate?

antiparagon
Tuesday, August 17, 2004

"Isn't this why Joel wrote the leaky abstraction article?"

My point exactly. You need to know enough of the data structures to know when to use them. A lot of people always use arrays ( vectors ), and often a stack or an AVL tree would serve them better.

I inheritered of a program which dealed with an array of instances of an Employee class in Visual Basic. Not knowing how to sort locally, the previous programmers inserted each employee in a temp table in Access, made a SELECT with a ORDER BY clause, and populated the array with the good order.

Eric V.
Tuesday, August 17, 2004

For a small company of, say, 50 employees, even  bogosort may be good enough. ;-)

Miles Archer
Tuesday, August 17, 2004

Who needs sorting algorithms?

Just randomly scramble the data, and check if it's sorted.  Rinse. Repeat.  Eventually it will be in sorted order.

NoName
Tuesday, August 17, 2004

that's another reason to get some 'book learnin'. you can speak the same language as others, or even just express a concept in a word or two, freeing up your mind to look at more difficult combinations of concept.

cf. two posts right above this one, saying the same thing.

mb
Tuesday, August 17, 2004

>Just randomly scramble the data, and check if it's sorted.  Rinse. Repeat.  Eventually it will be in sorted order.

Also instead of folding you clothes, just chuck them in the clothes dryer, open the door form time to time. If you do this often enough you will eventually open the door to see a nice neat stack of folded washing....works every time! *wink*

Aussie Chick
Wednesday, August 18, 2004

Herbert,
IMHO I think you have asked the wrong people. Most programmers here come from a CS background (academic or self thaugh). Don't forget there is also Information Systems as a career. As far as I know, this people focus more on Business Software, high abstractions, and do less of algorithms/ low level abstractions.  Just see the difference between what they do and how they do it.
CS- program en C++, etc. program games, embedded, main worries are about performance, memory management, etc, etc.
IS- they use tools like you described, they drag components, etc, etc... they do more UML, etc, etc... main worries are about meeting client needs, business processes, etc, etc.  Maybe you can call them system analysts, although they also program in a way...

Mutant
Wednesday, August 18, 2004

My 2 Cents:

Sorting algorithms have a long history, are well understood, are easily understandable (both goal and process) and have three or four implementations with nice trade-offs in terms of space/size/efficiency.

As such, they are classic teaching tools for beginning algorithm/language students.  It's not really the "search and sort" algorithm you are trying to learn.  It's the language, how to apply the language, what makes things efficient and what doesn't.

These concepts are MUCH more easily understood when you have a nice example that demonstrates them.  That's why they continue to be taught.

Hopefully, knowledge about the decisions needed to select a sort algorithm, and the work needed to implement it, also introduces some humility in the student.  Then he won't go off and wield these off-the-shelf libraries with blissful ignorance, but instead with some engineering understanding.

AllanL5
Wednesday, August 18, 2004

+++It's not really the "search and sort" algorithm you are trying to learn.  It's the language, how to apply the language, what makes things efficient and what doesn't.
+++

Absolutely.  And you don't need a CS program, or indeed, extensive instruction in "search and sort", to accomplish this.

muppet
Wednesday, August 18, 2004

In 23 years, I've had to write a sort from scratch exactely one time, and that was 20 years ago.

I have had to write or modify searches a couple of time over the years.

I also remember solving a nasty performance problem with an application because I remembered (from my college course work) that the technique being used to create an  index was triggering a worst-case search.

Why is this relevant to the discussion ?

1. Because sometimes you really do have to know this stuff even if you'll never have to implement it from scratch.

2. Not eveyone works in the client/server database app world. The world is full of embedded systems and systems that can't use a typical database. In those cases one must  roll his own or implement an acquired library. Neither  of these options is feasible unless one understands the underlying principles.

Tom

Tom
Wednesday, August 18, 2004

Tom,

Absolutely.  And as a good programmer comes across these issues, he'll learn the necessary background to surmount the problem.  I still don't find a compelling argument here that CS indoctrination is necessary or even desirable unless you're simply not cut out to be a programmer but you're doing it anyway.

muppet
Wednesday, August 18, 2004

"And as a good programmer comes across these issues..."

The *good* part is important.  Other programmers may not even realize there is a problem, or may be barking up the wrong tree when the problem manifests itself.  They'll just blindly pick a sorting routine from a library, and everything is OK until a year or two later when the increased data volume slows the system to a crawl.

NoName
Wednesday, August 18, 2004

+++The *good* part is important.  Other programmers may not even realize there is a problem, or may be barking up the wrong tree when the problem manifests itself.  They'll just blindly pick a sorting routine from a library, and everything is OK until a year or two later when the increased data volume slows the system to a crawl.+++

Right, and those 'other' programmers are the guys who need a CS program.

muppet
Wednesday, August 18, 2004

Muppet you are probably more CS indoctrinated than some people here, the thing is that you did it yourself not at school… but, what’s the difference?
You didn’t need formal education to develop your “natural” skills, so what! You still needed some practice, you still needed to read books, and so… that looks pretty similar to me.

Muppet’sFairyMother
Thursday, August 19, 2004

MFM -

There's a huge philosophical and psychological difference.  Because I learned on my own, I learned (as close to) without bias as possible.  The only biases I was exposed to are the ones the authors of the books I chose held.  I tried my best to find viewpoints from all sides, which is why I know a few core languages very well, and then I know a little bit about quite a few other languages that I don't use regularly, but I know their strengths.

In a CS program you are indoctrinated by a group of professors who likely have the same "political" agenda (as far as software geeks get political), because after all, they were all a fit for the organization in which they teach.

Also, instead of learning how to do a binary sort, I learned critical thinking skills (by being forced to figure things out on my own) that allow me to derive these sorts of algorithms without having them handed to me.  This, I think, gives me a much freer and complete association in my mind and allows me to piece things together differently and more agiley than a college trained programmer.

Of course, these are only my opinions.  I do own one book on common algorithms (specifically, in Perl), but I didn't pick that one up until last year, mostly as a lark to see what sort of things I may not have thought of.  Of course there is advantage in standing on the shoulders of giants, but I think you ought to have to climb there, first.

muppet
Thursday, August 19, 2004

*  Recent Topics

*  Fog Creek Home