Fog Creek Software
Discussion Board




Nice Search Engine

How different/ same is this from Google/ Teoma?

Prakash S
Tuesday, December 04, 2001

By this I mean the Search Engine on the left of the page

Prakash S
Tuesday, December 04, 2001

For one thing, it doesn't validate input very well:

Microsoft OLE DB Provider for SQL Server error '80040e14'

A clause of the query contained only ignored words.

C:\WWW\DISCUSS.FOGCREEK.COM\WEBSITE\JOELONSOFTWARE\..\include.asp, line 582

Johannes Bjerregaard
Tuesday, December 04, 2001

ha ha! :)

It's using SQL Server Full Text Search, which is terrible.

If somebody knows of a better way to do full text search on SQL Server data, I'd love to hear it.

Joel Spolsky
Tuesday, December 04, 2001


Nopes. In my old job, we had to licence a search engine, then export the HUGE database to a linux server, then reindex it, etc.

It was a hell to implement, but it runs very fast.

Leonardo Herrera
Tuesday, December 04, 2001


Oh, and Joel, don't blame the text search service for your faults in validating the input ;-)

Leonardo Herrera
Tuesday, December 04, 2001

No, I'm blaming it for being terrible.
For example when you give it a phrase it assumes "or", not "and". I can't figure out how to make it search for an exact phrase or do "and" by default...

Joel Spolsky
Tuesday, December 04, 2001

Here's a general text search technique that will work for any relational table, its more likely to be useful in Roman languages because of word/letter frequency but those fluent in other languages could probably tune it accordingly.

One of the problems of any full text search is that you rapidly get to a decreasing return over performance, either it takes ages to spit out a small set of rows or it spits out almost everything and you die from trying to handle the return.

Instead create a field on each appropriate table for a unique set of tags made up of between 4 and 5 characters of each word of text in the row.  The tags can be created on an insert and update trigger.

Then in the search routine clip the user's word entry to whatever the tag word size is and build the SQL query on just the tag field (applying whatever joins are appropriate).

Because of the way Roman/Germanic languages build words the first 4 or 5 letters are statistically going to match the user's intent.  There will be false returns but the proportion should be acceptably small.

Simon Lucy
Wednesday, December 05, 2001

To do an exact phrase search or specify Boolean operations you need to use CONTAINS rather than FREETEXT:

SELECT *
FROM tblWebLog
WHERE CONTAINS(Description, '"bad joke"')

SELECT *
FROM tblWebLog
WHERE CONTAINS(Description, '"Lego" AND "laser"')

Implementing a user interface for this is left as an exercise for the reader.

Mike Gunderloy
Wednesday, December 05, 2001

If you've got static files, perhaps you could try Index Server search? (though it gobbles resources)

MadMan
Wednesday, December 05, 2001

*  Recent Topics

*  Fog Creek Home