Fog Creek Software
Discussion Board

Search Tool Recommendations - Which One?


We are looking for an easy search tool to cover a number of websites plus some folders on a server that contain XLS, PDF,
and Word documents.

We do not have many resources or much technical experience, so something simple and inexpensive would be nice.


Not a Savant
Monday, January 5, 2004 - seriously

Monday, January 5, 2004

Assuming that your website and shared docs are on a Win2k box, use the Indexing Service.

Open Computer Management and expand Services and Applications.  There are some built-in catalogs there, but you can create a new catalog and point it at either your website or your docs share (DON'T store the catalog in the same location as the content it's indexing!).

Support for XLS and DOC files is built in.  You can download a PDF filter from Adobe.
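
The warning above about catalog placement is easy to check before you create the catalog. Here is a minimal Python sketch, assuming you just want to compare two Windows paths; the function name and the example paths are made up for illustration and are not part of the Indexing Service itself:

```python
from pathlib import PureWindowsPath


def catalog_inside_store(catalog_dir: str, indexed_dir: str) -> bool:
    """Return True if the catalog folder is the indexed folder or sits beneath it.

    Hypothetical helper: it only compares path strings and never touches
    the disk or the Indexing Service.
    """
    catalog = PureWindowsPath(catalog_dir)
    store = PureWindowsPath(indexed_dir)
    # PureWindowsPath compares case-insensitively, like Windows itself.
    return catalog == store or store in catalog.parents


if __name__ == "__main__":
    # Example paths are assumptions, not taken from a real server.
    print(catalog_inside_store(r"D:\Docs\Catalog", r"D:\Docs"))
    print(catalog_inside_store(r"E:\Catalogs\Docs", r"D:\Docs"))
```

If the function returns True, pick a different folder (ideally on another drive) for the catalog.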

John Murray
Monday, January 5, 2004

I've recently done a lot of research in this area, comparing maybe a dozen software programs.  There are multiple tiers depending on your needs.

On one site I installed an excellent $40 package, the Fluid Dynamics search engine.  It handles up to 10,000 pages: HTML, PDF, and Word (the last two with add-ins).  Multiple sites are OK, as is a customized search page, etc.  Quite a bargain, and it does the basic stuff well.

The next level up is packages in the $500 - $1500 range.  These offer higher volume and some enhanced features (scripting, customized fields).  Thunderstone Webinator and dtSearch are both good contenders here.

There are a few programs for $5000 - $10,000 that offer significant flexibility and customization.  They are typically aimed at departments or at a single web site or small group of web sites.  Thunderstone Texis and Verity Ultraseek are two good examples.

I installed an Ultraseek-based site for a client recently and really liked it.  Nice admin page for GUI-driven customization, ability to set up Yahoo-like directory searches, extensive Java API with ability to read/write directly to the search index.  It was also very good with custom fields (e.g. Author/Title/Publisher/etc.), allowing them to be stored in document meta tags or a third-party database.  And it handled every file type imaginable (HTML, Word, PPT, XLS, and PDF were what I needed).

Finally, there's a fourth tier that's $50,000 and up (OpenText Basis, Documentum) for enterprise "document management" capability, e.g. working with content on many decentralized servers across a whole organization.
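
To judge which tier fits, it helps to know roughly how many documents you have, since the cheapest package mentioned above tops out around 10,000 pages. Here is a rough Python sketch, assuming the documents sit in ordinary folders and that one file counts as one "page" (a product may count differently); the function names and the extension list are my own illustration:

```python
import os
from collections import Counter

# File types mentioned in this thread; the exact set is an assumption.
DOC_EXTENSIONS = {".html", ".htm", ".pdf", ".doc", ".xls", ".ppt"}


def count_documents(root, extensions=DOC_EXTENSIONS):
    """Walk root recursively and count files per extension, skipping other types."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in extensions:
                counts[ext] += 1
    return counts


def fits_page_limit(counts, limit=10_000):
    """True if the total number of counted documents is within the limit."""
    return sum(counts.values()) <= limit
```

Run count_documents on each folder or web root you plan to index and add up the totals; if fits_page_limit comes back False, the lowest tier is probably out.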

All of the above can be found via Google, but if you need more info, let me know.

A helpful resource  (especially regarding hosted search tools, which I did not investigate but are often quite inexpensive) is:

Monday, January 5, 2004

If the site is on the internet you could try:

Monday, January 5, 2004
